Method for Selecting a Treatment for Non-Small Cell Lung Cancer Using Gene Expression Profiles

ABSTRACT

The present invention identifies and quantifies changes in gene expression associated with non-small cell lung cancer (NSCLC) by examining gene expression in tissue from normal lung and diseased lung. The present invention also identifies and quantifies expression profiles which serve as useful diagnostic markers as well as markers that are useful to monitor disease states, disease progression, drug toxicity, drug efficacy and drug metabolism.

CROSS-REFERENCE TO RELATED APPLICATIONS

This is a divisional application of U.S. patent application Ser. No. 10/508,932, filed on Sep. 24, 2004, and issued as U.S. Pat. No. 8,722,331 on May 13, 2014, which is the National Stage Entry of PCT Application Number PCT/US2003/09428, filed on Mar. 27, 2003. PCT Application Number PCT/US2003/09428 claims priority to U.S. Provisional Application No. 60/368,409, filed on Mar. 28, 2002, and U.S. Provisional Application No. 60/368,288, filed Mar. 28, 2002, the entire disclosures of which are expressly incorporated herein by reference for all purposes.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grant No. NIH CA85147 and Grant No. NIH CA81126, both awarded by the National Institutes of Health. The government has certain rights in the invention.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted via EFS-Web and is hereby incorporated by reference in its entirety. The ASCII copy is named 1-56063-D2002-07-2_SL, and is 9,283 bytes in size.

BACKGROUND OF THE INVENTION

Non-small cell lung cancer (NSCLC) is the most common type of bronchogenic carcinoma. Although chemotherapeutic regimens with greater efficacy continue to be developed, the best regimens presently give an overall regression rate of only 30-50%. This lack of response is attributable to resistance that is present de novo or develops in response to treatment. It is believed that mechanisms of chemoresistance likely involve multiple gene products. It is important to define the role of specific genes involved in tumor development and growth and to identify and quantify those genes and gene products that can serve as targets for diagnosis, prevention, monitoring and treatment of cancer.

In certain instances, therapeutic agents that are initially effective become ineffective or less effective for a patient over time. The same therapeutic agent can continue to be effective for a longer period of time for a different patient. Further, the therapeutic agents can be ineffective or harmful to still other patients. Therefore, it would be beneficial to identify genes and/or gene products that could serve as markers with respect to cancers and to given therapeutic agents. The ability to make such predictions and corrections in the treatment makes it possible to make more accurate decisions on the therapeutic regimen at an earlier stage in the course of a patient's treatment.

Currently, cisplatin and carboplatin are among the most widely used cytotoxic anticancer drugs. However, resistance to these drugs through de novo or induced mechanisms undermines their curative potential. Perez, R. P., Cellular and molecular determinants of cisplatin resistance, Eur. J. Cancer (1998), 34, 1535-1542. Recently, understanding regarding potential modes of chemoresistance to platinum compounds has been obtained through studies correlating cytotoxicity with nucleotide excision-repair (NER) (Dijt, F., Fichtinger-Schepman, A. M., Berends, F., Reedijk, J., Formation and repair of cisplatin-induced adducts to DNA in cultured normal and repair-deficient human fibroblasts, Cancer Res. (1988), 48, 6058-6062. Zamble, D. B., Lippard, S. J., Cisplatin and DNA repair in cancer chemotherapy, Trends Biochem Sci (1995), 20, 435-439. States, J. C., Reed, E., Enhanced XPA mRNA levels in cisplatin-resistant human ovarian cancer are not associated with XPA mutations or gene amplifications, Cancer Lett. (1996), 108, 233-237. Ferry, K. V., Fink, D., Johnson, S. W., Hamilton, T. C., Howell, S. B., Quantitation of platinum-DNA adduct repair in mismatch repair deficient and proficient human colorectal cancer cell lines using an in vitro DNA repair assay, Proc. Am. Assoc. Cancer Res. (1997), abstract, 38, 359. Jordan, P., Carmo-Fonseca, M., Molecular mechanisms involved in cisplatin cytotoxicity, Cell Mol. Life Sci. (2000), 57, 1229-1235. Kartalou, M., Essigmann, J. M., Mechanisms of resistance to cisplatin, Mutat. Res. (2001), 478, 23-43) or drug uptake/efflux (Kartalou, M., Essigmann, J. M., Mechanisms of resistance to cisplatin, Mutat. Res. (2001), 478, 23-43. Berger, W., Elbling, L., Hauptmann, E., Micksche, M., Expression of the multidrug resistance-associated protein (MRP) and chemoresistance of human non-small-cell lung cancer cells, Int. J. Cancer (1997), 73, 84-93. Borst, P., Kool, M., Evers, R., Do cMOAT (MRP2), other MRP homologues, and LRP play a role in MDR? Cancer Biol. (1997), 8, 205-213. Young, L. C., Campling, B. G., Voskoglou-Nomikos, T., Cole, S. P. C., Deeley, R. G., Gerlach, J. H., Expression of multidrug resistance protein-related genes in lung cancer: correlation with drug response, Clin. Cancer Res. (1999), 5, 673-480. Berger, W., Elbling, L., Micksche, M., Expression of the major vault protein LRP in human non-small-cell lung cancer cells: activation by short-term exposure to antineoplastic drugs, Int. J. Cancer (2000), 88, 293-300. Borst, P., Evers, R., Kool, M., Wijnholds, J., A family of drug transporters: the multidrug resistance-associated proteins, J. Nat. Cancer Inst. (2000), 92, 1295-1302. Oguri, T., Isobe, T., Suzuki, T., Nishio, K., Fujiwara, Y., Katoh, O., Yamakido, M., Increased expression of the MRP5 gene is associated with exposure to platinum drugs in lung cancer, Int. J. Cancer (2000), 86, 95-100).

Current advances in technology, including microarrays and quantitative RT-PCR methods, are allowing classification of cancer types on the basis of functional genomics as opposed to histomorphology. Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D., Lander, E. S., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science (1999), 286, 531-537. Alizadeh, A. A., Eisen, M. B., Davis, R. E., Ma, C., Lossos, I. S., Rosenwald, A., Boldrick, J. C., Sabet, H., Tran, T., Yu, X., Powell, J. I., Yang, L., Marti, G. E., Moore, T., Hudson, Jr., J., Lu, L., Lewis, D. B., Tibshirani, R., Sherlock, G., Chan, W. C., Greiner, T. C., Weisenburger, D. D., Armitage, J. O., Warnke, R., Staudt, L. M., et al., Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling, Nature (2000), 403, 503-511. For example, these technologies may allow for the discovery of predictive markers based on gene expression profiles. Microarray screening analysis currently is being investigated to predict chemotherapeutic sensitivity based on gene expression profiles. Scherf, U., Ross, D. T., Waltham, M., Smith, L. H., Lee, J. K., Tanabe, L., Kohn, K. W., Reinhold, W. C., Myers, T. G., Andrews, D. T., Scudiero, D. A., Eisen, M. B., Sausville, E. A., Pommier, Y., Botstein, D., Brown, P. O., Weinstein, J. N., A gene expression database for the molecular pharmacology of cancer, Nat. Genet. (2000), 24, 236-244. Kihara, C., Tsunoda, T., Tanaka, T., Yamana, H., Furukawa, Y., Ono, K., Kitahara, O., Zembutsu, H., Yanagawa, R., Hirata, K., Takagi, T., Nakamura, Y., Prediction of sensitivity of esophageal tumors to adjuvant chemotherapy by cDNA microarray analysis of gene-expression profiles, Cancer Res. (2001), 61, 6474-6479. Zembutsu, H., Ohnishi, Y., Tsunoda, T., Furukawa, Y., Katagiri, T., Ueyama, Y., Tamaoki, N., Nomura, T., Kitahara, O., Yanagawa, R., Hirata, K., Nakamura, Y., Genome-wide cDNA microarray screening to correlate gene expression profiles with sensitivity of 85 human cancer xenografts to anticancer drugs, Cancer Res. (2002), 62, 518-527. An advantage of microarray analysis is that thousands of genes may be simultaneously evaluated. However, it is generally recognized that, due to lack of standardization, relatively low sensitivity and relatively poor lower thresholds of detection, microarray assessments need to be confirmed with follow-up quantitative methods. StaRT-PCR is a method that allows for rapid, reproducible, standardized, quantitative measurements for many genes simultaneously. Willey, J. C., Crawford, E. L., Jackson, C. M., Weaver, D. A., Hoban, J. C., Khuder, S. A., DeMuth, J. P., Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates, Am. J. Respir. Cell Mol. Biol. (1998), 19, 6-17. Weaver, et al. Comparison of expression patterns by microarray and standardized RT-PCR analyses in lung cancer cell lines with varied sensitivity to carboplatin. Proc. Am. Assoc. Cancer Res. 2001 (abstract) 42, 606.

StaRT-PCR can also be used to more accurately diagnose lung cancer in small biopsy tissues. Warner, et al. “High c-myc×E2F-1/p21 may augment cytologic diagnosis of NSCLC” Proc. Am. Assoc. Cancer Res. Vol. 43, abstract 3738, March 2002; Weaver, et al. “Gene expression modeling of cisplatin chemoresistance in non-small cell lung cancer cell lines utilizing standardized RT StaRT-PCR” Proc. Am. Assoc. Cancer Res. Vol. 43, abstract 5471, March 2002.

SUMMARY OF THE INVENTION

The present invention identifies patterns of individual gene expression and/or interactive gene expression indices (IGEI) comprising the expression values of multiple genes which, in one instance, are more effective markers of chemoresistant non-small cell lung cancer (NSCLC) tumors than expression values of individual genes, and in another instance, may be used to more accurately diagnose lung cancer in small biopsy tissues.

The present invention is directed to the identification and use of markers that can be used to determine the sensitivity of cancer cells to a therapeutic agent. More specifically, the invention features “a number of markers” that are variably expressed in cancer tissue and can be used to determine the sensitivity of cancer cells to a therapeutic agent. Still more specifically, the invention features “interactive gene expression indices” (IGEI) useful for assessment of biological samples to prospectively identify the usefulness of therapeutic agents.

The present invention thus provides gene expression profiles which serve as useful diagnostic markers as well as markers that can be used to monitor disease states, disease progression, drug toxicity, drug efficacy and drug metabolism.

The present invention further provides a method to determine whether an agent or combination of agents can be used to reduce the growth of cancer cells, as well as a method for determining new agents for the treatment of cancer. Various embodiments of the present invention are directed to uses of the identified markers whose expression is correlated with accurate diagnosis of lung cancer cells or tissue compared to normal tissues, and other markers whose expression is correlated with sensitivity to treatment with a therapeutic agent. In particular, the present invention provides, without limitation: 1) methods for determining whether a particular tissue is lung cancer or non-cancer tissue; 2) methods for monitoring the effectiveness of therapeutic agents used for the treatment of cancer; 3) methods for developing new therapeutic agents for the treatment of cancer; and 4) methods for identifying combinations of therapeutic agents for the treatment of cancer.

By examining and quantifying the expression of one or more of the identified markers in a sample of cancer cells, it is further possible to determine which therapeutic agent or combination of agents will be most likely to reduce the growth rate of the cancer; this determination can further be used in selecting appropriate treatment agents.

By examining and quantifying the expression of one or more of the identified markers in a sample of cancer cells, it is also possible to determine which therapeutic agent or combination of agents will be the least likely to reduce the growth rate of the cancer.

By examining and quantifying the expression of one or more of the identified markers, it is also possible to eliminate inappropriate therapeutic agents.

By examining and quantifying the expression of one or more identified markers when cancer cells or a cancer cell line are exposed to a potential anti-cancer agent, it is possible to assess the efficacy of new anti-cancer agents.

Further, by examining and quantifying the expression of one or more of the identified markers in a sample of cancer cells taken from a patient during the course of therapeutic treatment, it is possible to determine whether the therapeutic treatment is continuing to be effective or whether the cancer has become resistant (refractory) to the therapeutic treatment. These determinations can be made on a patient-by-patient basis or on an agent-by-agent (or combination of agents) basis. It may also be possible to determine whether or not a particular therapeutic treatment is likely to benefit a particular patient or group/class of patients, or whether a particular treatment should be continued.

The present invention further provides previously unknown or unrecognized targets for the development of anti-cancer agents, such as chemotherapeutic compounds.

The identified interactive gene expression indices (IGEI) of the present invention are useful as targets in developing treatments (either for a single agent or for multiple agents) for cancer.

The present invention identifies the global changes in gene expression associated with lung cancer by examining gene expression in tissue from normal lung and diseased lung. The present invention also identifies expression profiles which serve as useful diagnostic markers as well as markers that can be used to monitor disease states, disease progression, drug toxicity, drug efficacy and drug metabolism.

In some preferred embodiments, the methods, genes, and IGEI described herein are useful to identify cisplatin-resistant cancers (in contrast to diagnosing cancers from normal tissues). Such embodiments may include detecting the expression level of one or more genes selected from a group consisting of ERCC2, ABCC5, XPA and XRCC1.

In some preferred embodiments, the method may include detecting the expression level of one or more genes selected from a group consisting of ERCC2/XPC, ABCC5/GTF2H2, ERCC2/GTF2H2, XPA/XPC and XRCC1/XPC.

In some preferred embodiments, the method may include detecting the expression level of one or more genes selected from a group consisting of ABCC5/GTF2H2 and ERCC2/GTF2H2.
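The paired designations above can be read as two-gene indices. The following is a minimal sketch, assuming that a designation such as ERCC2/XPC denotes the ERCC2 expression value divided by the XPC expression value, with both values in the same standardized units. The function name and all numerical values are hypothetical and are given for illustration only.

```python
# Gene pairs named in the text. Treating "A/B" as expression(A) / expression(B)
# is an assumption made for this illustration.
INDEX_PAIRS = [
    ("ERCC2", "XPC"),
    ("ABCC5", "GTF2H2"),
    ("ERCC2", "GTF2H2"),
    ("XPA", "XPC"),
    ("XRCC1", "XPC"),
]


def ratio_indices(expression):
    """Return {"NUM/DEN": ratio} for every pair whose two genes were measured."""
    indices = {}
    for numerator, denominator in INDEX_PAIRS:
        if numerator not in expression or denominator not in expression:
            continue  # skip pairs that cannot be formed from the measured genes
        if expression[denominator] <= 0:
            raise ValueError(f"{denominator} expression must be positive")
        indices[f"{numerator}/{denominator}"] = (
            expression[numerator] / expression[denominator]
        )
    return indices


# Hypothetical expression values (molecules per 10^6 beta-actin molecules).
example_expression = {"ERCC2": 1200.0, "XPC": 300.0, "ABCC5": 900.0, "GTF2H2": 450.0}
example_indices = ratio_indices(example_expression)
```

In this sketch, pairs involving an unmeasured gene are simply omitted rather than reported as missing values.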

The invention also includes methods of detecting the progression of NSCLC and/or differentiating small cell lung cancer (SCLC) and/or nonmetastatic from metastatic disease. For instance, methods of the invention include detecting the progression of NSCLC in a patient comprising the step of detecting the level of expression in a tissue sample of two or more genes from Tables 1 and/or 5; wherein differential expression of the genes in Tables 1 and/or 5 is indicative of NSCLC progression. In some preferred embodiments, one or more genes may be selected from a group consisting of the genes listed in Table 5 (FIG. 5).

In some aspects, the present invention provides a method of monitoring the treatment of a patient with NSCLC, comprising administering a pharmaceutical composition to the patient, preparing a gene expression profile from a cell or tissue sample from the patient, and comparing the patient gene expression profile to a gene expression profile from a cell population comprising normal lung cells or to a gene expression profile from a cell population comprising lung cancer cells or to both. In some preferred embodiments, the gene profile will include the expression level of one or more genes in Tables 1 and 5 (FIGS. 1 and 5, respectively). In other preferred embodiments, one or more genes may be selected from a group consisting of the genes listed in Table 5 (FIG. 5).

In another aspect, the present invention provides a method of treating a patient with NSCLC, comprising administering to the patient a pharmaceutical composition, wherein the composition alters the expression of at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively), preparing a gene expression profile from a cell or tissue sample from the patient comprising tumor cells and comparing the patient expression profile to a gene expression profile from an untreated cell population comprising NSCLC cells. In some preferred embodiments, one or more genes may be selected from a group consisting of the genes listed in Table 5 (FIG. 5).

The invention includes methods of diagnosing the presence or absence of lung cancer in a patient comprising the step of detecting the level of expression in a tissue sample of an IGEI comprising c-myc×E2F-1/p21 (Sequence ID Nos. 40-48, since each gene has 3 primer sequences), in which the c-myc gene expression value (molecules/10⁶ β-actin molecules) is multiplied by the E2F-1 expression value and this product is divided by the p21 gene expression value.
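The arithmetic of this index can be sketched as follows. This is an illustrative sketch only: the function name and the expression values are hypothetical, and each input is assumed to be a standardized value in molecules/10⁶ β-actin molecules as described above.

```python
def myc_e2f1_p21_index(c_myc, e2f1, p21):
    """Return (c-myc x E2F-1) / p21.

    Each argument is assumed to be an expression value in
    molecules per 10^6 beta-actin molecules.
    """
    if p21 <= 0:
        raise ValueError("p21 expression must be positive to form the index")
    return (c_myc * e2f1) / p21


# Hypothetical values, for illustration only.
index = myc_e2f1_p21_index(c_myc=2000.0, e2f1=500.0, p21=4000.0)
print(index)  # 250.0
```

Because p21 appears in the denominator, a sample with undetectable p21 cannot yield an index value under this sketch; how such samples are handled is not specified here.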

The c-myc×E2F-1/p21 index may also be used as a marker for the monitoring of disease progression, for instance, the development of lung cancer. For instance, a lung tissue sample or other sample from a patient may be assayed by any of the methods described herein, and the expression levels in the sample of c-myc×E2F-1/p21 may be compared to the expression levels found in normal lung tissue, tissue from SCLC, metastatic lung cancer or NSCLC tissue. Comparison of the expression data, as well as available sequence or other information, may be done by a researcher or diagnostician or may be done with the aid of a computer and databases as described herein.

The invention further includes methods of screening for an agent capable of modulating the onset or progression of NSCLC, comprising the steps of exposing a cell to the agent; and detecting the expression level of the c-myc×E2F-1/p21 index.

According to one aspect of the present invention, the genes identified in Tables 1 and 5 (FIGS. 1 and 5, respectively) may be used as markers to evaluate the effects of a candidate drug or agent on a cell or tissue sample, for instance, a lung cancer cell or tissue sample. A candidate drug or agent can be screened for the ability to stimulate the transcription or expression of a given marker or set of marker genes (drug targets) or to down-regulate or counteract the transcription or expression of a marker or markers. According to the present invention, one can also compare the specificity of different drugs by comparing their effects on gene expression markers. More specific drugs may have fewer transcriptional targets. Similar sets of markers identified for two drugs indicate a similarity of effects.

Any of the methods of the invention described above may include the detection and quantification of at least 2 genes from Tables 1 and/or 5 or c-myc×E2F-1/p21. Preferred methods may detect and quantify all or nearly all of the genes in the tables. In some preferred embodiments, one or more genes may be selected from a group consisting of the genes listed in Table 5 (FIG. 5).

According to another aspect, the present invention relates to a method of diagnosing non-small cell lung cancer in a patient, comprising: (a) detecting and quantifying the level of expression in a tissue sample of c-myc, E2F-1 and p21 genes; wherein differential expression of the c-myc, E2F-1 and p21 genes is indicative of non-small cell lung cancer.

In another aspect, the present invention relates to a method of detecting the progression of non-small cell lung cancer in a patient, comprising: (a) detecting and quantifying the level of expression in a tissue sample of c-myc, E2F-1 and p21 genes; wherein differential expression of the c-myc, E2F-1 and p21 genes is indicative of non-small cell lung cancer progression.

In still other aspects, the present invention relates to a method of monitoring the treatment of a patient with non-small cell lung cancer, comprising: (a) administering a pharmaceutical composition to the patient; (b) preparing a gene expression profile from a cell or tissue sample from the patient; and (c) comparing the patient gene expression profile to a gene expression profile from a cell population selected from the group consisting of normal lung cells and non-small cell lung cancer cells.

In still more aspects, the present invention relates to a method of treating a patient with non-small cell lung cancer, comprising: (a) administering to the patient a pharmaceutical composition, wherein the composition alters the expression of at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively) or c-myc, E2F-1 and p21 genes; (b) preparing an IGEI comprising standardized gene expression values using StaRT-PCR from a cell or tissue sample comprising tumor cells obtained before treatment and another sample obtained after treatment; and (c) comparing the sample obtained prior to treatment with the sample obtained after treatment.

Yet other aspects of the present invention relate to a method of screening for an agent capable of modulating the onset or progression of non-small cell lung cancer, comprising: (a) preparing a first IGEI comprising standardized gene expression values using StaRT-PCR of a cell population comprising non-small cell lung cancer cells, wherein the first IGEI determines the expression level of one or more genes from Tables 1 and 5 (FIGS. 1 and 5, respectively) or c-myc, E2F-1 and p21 genes; (b) exposing the cell population to the agent; (c) preparing a second IGEI comprising standardized gene expression values using StaRT-PCR of the agent-exposed cell population; and (d) comparing the first and second IGEIs.
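Step (d) does not prescribe how the two IGEIs are to be compared. One possible comparison, offered here only as an assumed illustration, is the base-2 logarithm of the ratio of the second (agent-exposed) index to the first (unexposed) index. The function name and the index values are hypothetical.

```python
import math


def igei_log2_fold_change(first_igei, second_igei):
    """Return log2(second / first) for two positive index values.

    A negative result means the index fell after exposure to the agent;
    a positive result means it rose. Using a log2 ratio is an assumption
    of this sketch, not a comparison specified in the text.
    """
    if first_igei <= 0 or second_igei <= 0:
        raise ValueError("both IGEI values must be positive")
    return math.log2(second_igei / first_igei)


# Hypothetical index values before and after exposure to a candidate agent.
change = igei_log2_fold_change(first_igei=250.0, second_igei=62.5)
print(change)  # -2.0, i.e. a four-fold decrease
```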

In another aspect, the present invention relates to one or more solid phase hybridization templates for measuring, in a standardized fashion, PCR products following standardized quantitative RT-PCR where the template is formed as follows:

a) preparing at least one solid phase hybridization template where, for each gene, an oligonucleotide of any length that will bind with specificity to both the competitive template, CT, and native template, NT, is spotted to a filter;

b) identifying a suitable oligonucleotide such that the region between the forward primer (common to both the NT and CT) and the 3′ 20 bp of the reverse CT primer is evaluated;

c) attaching an oligonucleotide to a solid support at a previously designated location;

d) amplifying the CT and NT PCR products and hybridizing to the spots of the filter, wherein each gene (NT and CT) is amplified separately;

e) pooling the PCR products for hybridization; and

f) preparing two oligonucleotide probes, each labeled with a different fluor, for each gene, wherein one oligonucleotide is homologous to, and will bind to, sequences unique to the NT for a gene that was PCR-amplified, such that this oligonucleotide binds to the region of the NT that is not homologous to the CT, and wherein the other oligonucleotide is specific to the CT, such that this other oligonucleotide is homologous to and will bind to CT sequences that span the 3′ end of the reverse primer.

In certain embodiments, the NT-specific and CT-specific oligonucleotides for multiple genes are mixed in equal amounts and hybridized to the gene-specific PCR products bound to the gene-specific oligonucleotides spotted on the filter. Also, the ratio between the fluors bound to the spot quantifies the NT relative to the CT. Although there may be different binding affinities between the CT and CT probe relative to that between the NT and NT probe, this difference is consistent between different samples assessed, and from one experiment to another. It should be noted that the template can comprise at least one standardized microarray, microbeads, glass slides, or chips prepared by photolithography, and that the solid support can be a membrane, a glass support, a filter, a tissue culture dish, a polymeric material, a bead or a silica support. In certain embodiments, the solid support comprises at least two oligonucleotides, wherein each of the oligonucleotides comprises a sequence that specifically hybridizes to at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively) or the c-myc, E2F-1 and p21 genes. It should also be noted that the oligonucleotides can be covalently attached to the solid support, or alternatively can be non-covalently attached to the solid support. ... expression level in units of molecules/10⁶ β-actin molecules for the set of genes in normal lung tissue.
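The quantification implied by the fluor ratio can be sketched as follows. This sketch assumes that a known number of CT molecules was included in the amplification reaction, so that the NT/CT signal ratio scales that known quantity; it also assumes an optional correction factor for the consistent NT-probe versus CT-probe affinity difference noted above. The function names, the correction factor, and all numerical values are hypothetical.

```python
def nt_molecules_from_fluor_ratio(nt_signal, ct_signal, ct_molecules,
                                  affinity_correction=1.0):
    """Estimate native template (NT) molecules from a two-fluor spot reading.

    nt_signal and ct_signal are background-corrected intensities of the
    NT-specific and CT-specific probes at one spot. ct_molecules is the
    assumed known number of competitive template molecules in the reaction.
    affinity_correction is a hypothetical constant for the consistent
    difference in probe binding affinity.
    """
    if ct_signal <= 0:
        raise ValueError("CT signal must be positive")
    return (nt_signal / ct_signal) * affinity_correction * ct_molecules


def per_million_actin(gene_molecules, actin_molecules):
    """Express a gene measurement as molecules per 10^6 beta-actin molecules."""
    if actin_molecules <= 0:
        raise ValueError("beta-actin measurement must be positive")
    return gene_molecules * 1e6 / actin_molecules


# Hypothetical spot readings for one target gene and for beta-actin.
gene_nt = nt_molecules_from_fluor_ratio(nt_signal=1500.0, ct_signal=3000.0,
                                        ct_molecules=6000.0)
actin_nt = nt_molecules_from_fluor_ratio(nt_signal=4000.0, ct_signal=4000.0,
                                         ct_molecules=600000.0)
normalized = per_million_actin(gene_nt, actin_nt)
```

With these hypothetical readings the target gene is estimated at 3000 NT molecules against 600,000 β-actin molecules, which corresponds to 5000 molecules/10⁶ β-actin molecules.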

The invention further includes computer systems comprising a numerical standardized database containing information identifying the expression level in lung tissue of a set of genes comprising at least two genes in Tables 1 and 5 (FIGS. 1 and 5, respectively) or c-myc×E2F-1/p21; and a user interface to view the information. In some preferred embodiments, one or more genes may be selected from a group consisting of the genes listed in Table 5 (FIG. 5). The numerical standardized database may further include sequence information for the genes, information identifying the expression level for the set of genes in normal lung tissue and malignant tissue (metastatic and nonmetastatic) and may contain links to external databases such as GenBank.

The invention further comprises kits useful for the practice of one or more of the methods of the invention. In some preferred embodiments, a kit may contain one or more solid supports having attached thereto one or more oligonucleotides. The solid support may be a high-density oligonucleotide array. Kits may further comprise one or more reagents for use with the arrays, one or more signal detection and/or array-processing instruments, one or more gene expression databases and one or more analysis and database management software packages. The kits, in certain preferred embodiments, have StaRT-PCR reagents with reagents to apply to standardized microarrays.

The invention still further includes methods of using the databases, such as methods of using the disclosed computer systems to present information identifying the expression level in a tissue or cell of at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively), comprising the step of comparing the expression level of at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively) in the tissue or cell to the level of expression of the gene in the database. In some preferred embodiments, one or more genes may be selected from a group consisting of the genes listed in Table 5 (FIG. 5).

Other features and advantages of the invention will be apparent from the detailed description and from the claims. Although materials and methods similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred materials and methods are described below.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1B: Table 1 shows primers used for PCR amplification, including the gene designation, GenBank accession number, Sequence ID number, primer, sequence, position in cDNA, and product length (bp).

FIG. 2: Table 2 shows the IC50 for NSCLC cell lines and the cisplatin levels.

FIGS. 3A-3B: Table 3 shows the gene expression in NSCLC cell lines (mRNAs/10⁶ ACTB mRNAs).

FIG. 4: Table 4 shows the correlation of gene expression with cisplatin chemoresistance in NSCLC cell lines.

FIG. 5: Table 5 shows the statistical assessments of cisplatin chemoresistance models in NSCLC cell lines.

FIG. 6: Table 6 shows the effect of collection methods on RNA quality in H1155 human NSCLC cells in Example II, which relates to IGEI used with fine needle aspirate (FNA) specimens for lung cancer diagnosis.

FIG. 7: Table 7 shows cytological information and diagnosis of FNA specimen cells in Example.

FIG. 8: Table 8 shows gene expression values and index values for c-myc, E2F-1 and p21 in FNA samples.

FIGS. 9A-9B: Schematic illustrations of an analysis of standardized RT-PCR products with microarrays and microbeads: FIG. 9A shows microarrays where the identity of the gene is known by the location of the microarray; and FIG. 9B shows microbeads where the identity of the gene is known by the fluorescent color of the bead.

DETAILED DESCRIPTION

Throughout this disclosure, various publications, patents and published patent specifications are referenced. The disclosures of these publications, patents and published patent specifications are hereby incorporated by reference into the present disclosure to more fully describe the state of the art to which this invention pertains.

The present invention is based, in part, on the identification and quantification of markers that can be used to determine whether cancer cells are sensitive to a therapeutic agent. Based on these identifications and quantifications, the present invention provides, without limitation: 1) methods for determining whether a therapeutic agent (or combination of agents) will or will not be effective in stopping or slowing tumor growth; 2) methods for monitoring the effectiveness of a therapeutic agent (or combination of agents) used for the treatment of cancer; 3) methods for identifying new therapeutic agents for the treatment of cancer; 4) methods for identifying combinations of therapeutic agents for use in treating cancer; 5) methods for identifying specific therapeutic agents and combinations of therapeutic agents that are effective for the treatment of cancer in specific patients; and 6) methods for diagnosing cancer.

DEFINITIONS

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described herein. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. The content of all GenBank and other database records, such as IMAGE Consortium and Unigene database records, cited throughout this application (including the Tables) is also hereby incorporated by reference. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be limiting.

The articles “a” and “an” are used herein to refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

A “marker” is a naturally occurring polymer corresponding to at least one of the nucleic acids listed in Tables 1-5 (FIGS. 1-5). For example, markers include, without limitation, sense and anti-sense strands of genomic DNA (i.e. including any introns occurring therein), RNA generated by transcription of genomic DNA (i.e. prior to splicing), RNA generated by splicing of RNA transcribed from genomic DNA, and proteins generated by translation of spliced RNA (i.e. including proteins both before and after cleavage of normally cleaved regions such as transmembrane signal sequences). As used herein, “marker” may also include a cDNA made by reverse transcription of an RNA generated by transcription of genomic DNA (including spliced RNA).

The term “probe” refers to any molecule which is capable of selectively binding to a specifically intended target molecule, for example a marker of the invention. Probes can be either synthesized by one skilled in the art, or derived from appropriate biological preparations. For purposes of detection of the target molecule, probes may be specifically designed to be labeled, as described herein. Examples of molecules that can be utilized as probes include, but are not limited to, RNA, DNA, proteins, antibodies, and organic monomers.

The “normal” level of expression of a marker is the level of expression of the marker in cells of a patient not afflicted with cancer.

As used herein, the term “promoter/regulatory sequence” means a nucleic acid sequence which is required for expression of a gene product operably linked to the promoter/regulatory sequence. In some instances, this sequence may be the core promoter sequence and in other instances, this sequence may also include an enhancer sequence and other regulatory elements which are required for expression of the gene product. The promoter/regulatory sequence may, for example, be one which expresses the gene product in a tissue-specific manner.

A “constitutive” promoter is a nucleotide sequence which, when operably linked with a polynucleotide which encodes or specifies a gene product, causes the gene product to be produced in a living human cell under most or all physiological conditions of the cell.

A “transcribed polynucleotide” is a polynucleotide (e.g. an RNA, a cDNA, or an analog of one of an RNA or cDNA) which is complementary to or homologous with all or a portion of a mature RNA made by transcription of a genomic DNA corresponding to a marker of the invention and normal post-transcriptional processing (e.g. splicing), if any, of the transcript.

“Complementary” refers to the broad concept of sequence complementarity between regions of two nucleic acid strands or between two regions of the same nucleic acid strand. It is known that an adenine residue of a first nucleic acid region is capable of forming specific hydrogen bonds (“base pairing”) with a residue of a second nucleic acid region which is antiparallel to the first region if the residue is thymine or uracil. Similarly, it is known that a cytosine residue of a first nucleic acid strand is capable of base pairing with a residue of a second nucleic acid strand which is antiparallel to the first strand if the residue is guanine. A first region of a nucleic acid is complementary to a second region of the same or a different nucleic acid if, when the two regions are arranged in an antiparallel fashion, at least one nucleotide residue of the first region is capable of base pairing with a residue of the second region. Preferably, the first region comprises a first portion and the second region comprises a second portion, whereby, when the first and second portions are arranged in an antiparallel fashion, at least about 50%, and preferably at least about 75%, at least about 90%, or at least about 95% of the nucleotide residues of the first portion are capable of base pairing with nucleotide residues in the second portion. More preferably, all nucleotide residues of the first portion are capable of base pairing with nucleotide residues in the second portion.
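By way of non-limiting illustration, the proportion of base-pairing residues described above may be computed as in the following minimal sketch. The function name `fraction_complementary` is hypothetical, and the sketch assumes two ungapped portions of equal length, each written 5′ to 3′; it is not a required implementation.

```python
# Base pairs named in the definition: A with T or U, and C with G
# (listed in both directions so either strand may carry either residue).
PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
}


def fraction_complementary(first: str, second: str) -> float:
    """Fraction of residues in `first` that can base pair with `second`
    when the two portions are arranged in an antiparallel fashion.

    Assumption: both portions are written 5' to 3', have equal length,
    and are compared without gaps.
    """
    first, second = first.upper(), second.upper()
    if not first or len(first) != len(second):
        raise ValueError("portions must be non-empty and of equal length")
    # Reversing the second portion places it antiparallel to the first.
    paired = sum((a, b) in PAIRS for a, b in zip(first, reversed(second)))
    return paired / len(first)


# A fully complementary pair, and a pair with one mismatch out of four.
full = fraction_complementary("ATGC", "GCAT")
partial = fraction_complementary("ATGC", "GCAA")
```

Under these assumptions, the returned value can be compared directly against the 50%, 75%, 90%, or 95% levels recited above.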

“Homologous” as used herein, refers to nucleotide sequence similarity between two regions of the same nucleic acid strand or between regions of two different nucleic acid strands. When a nucleotide residue position in both regions is occupied by the same nucleotide residue, then the regions are homologous at that position. A first region is homologous to a second region if at least one nucleotide residue position of each region is occupied by the same residue. Homology between two regions is expressed in terms of the proportion of nucleotide residue positions of the two regions that are occupied by the same nucleotide residue.

Preferably, the first region comprises a first portion and the second region comprises a second portion, whereby, at least about 50%, and preferably at least about 75%, at least about 90%, or at least about 95% of the nucleotide residue positions of each of the portions are occupied by the same nucleotide residue. More preferably, all nucleotide residue positions of each of the portions are occupied by the same nucleotide residue.
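As a non-limiting illustration of homology expressed as a proportion of identical positions, the following minimal sketch may be considered. The function name `fraction_homologous` is hypothetical, and the sketch assumes two ungapped portions of equal length read in the same orientation.

```python
def fraction_homologous(first: str, second: str) -> float:
    """Proportion of residue positions occupied by the same nucleotide.

    Assumption: the two portions have equal length and are compared
    position by position, without gaps.
    """
    first, second = first.upper(), second.upper()
    if not first or len(first) != len(second):
        raise ValueError("portions must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(first, second))
    return same / len(first)


# Three of four positions match, so the portions are 75% homologous.
example = fraction_homologous("ATGC", "ATGA")
```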

A marker is “fixed” to a substrate if it is covalently or non-covalently associated with the substrate such that the substrate can be rinsed with a fluid (e.g. standard saline citrate, pH 7.4) without a substantial fraction of the marker dissociating from the substrate.

As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g. encodes a natural protein).

Cancer is “inhibited” if at least one symptom of the cancer is alleviated, terminated, slowed, or prevented. As used herein, cancer is also “inhibited” if recurrence or metastasis of the cancer is reduced, slowed, delayed, or prevented. Cancer is also inhibited if cell proliferation decreases or the cell death rate increases.

A “kit” is any manufacture (e.g. a package or container) comprising at least one reagent, e.g. a probe, for specifically detecting a marker of the invention, the manufacture being promoted, distributed, or sold as a unit for performing the methods of the present invention.

SPECIFIC EMBODIMENTS

The examples provided below concern the identification and quantification of markers that distinguish cancer cell lines that are sensitive to defined chemotherapeutic agents, namely platinum compounds, from those that are not responsive. Accordingly, one or more of the markers can be used to identify cancer cells that can be successfully treated by that agent. A change in the expression of one or more of the markers can also be used to identify cancer cells that cannot be successfully treated by that agent. These markers can therefore be used in methods for identifying cancers that have become or are at risk of becoming refractory to treatment with the agent.

The expression level of the identified markers may be used to: 1) determine if a cancer can be treated by an agent or combination of agents; 2) determine if a cancer is responding to treatment with an agent or combination of agents; 3) select an appropriate agent or combination of agents for treating a cancer; 4) monitor the effectiveness of an ongoing treatment; and 5) identify new cancer treatments (either single agent or combination of agents).

In particular, the identified markers may be utilized to determine appropriate therapy, to monitor clinical therapy and human trials of a drug being tested for efficacy, and to develop new agents and therapeutic combinations.

Accordingly, the present invention provides methods for determining whether an agent can be used to inhibit cancer cells, comprising the steps of:

a) obtaining a sample of cancer cells;

b) determining and quantifying the level of expression in the cancer cells of a marker identified in Tables 1 and 5 (FIGS. 1 and 5, respectively); and

c) identifying that an agent can be used to inhibit the cancer cells when the marker is expressed at a certain level.

The present invention also provides methods for determining whether an agent is effective in treating cancer, comprising the steps of:

a) obtaining a sample of cancer cells;

b) exposing the sample to an agent;

c) determining and quantifying the level of expression of a marker identified in Tables 1 and 5 (FIGS. 1 and 5, respectively) in the sample exposed to the agent and in a sample that is not exposed to the agent; and

d) identifying that an agent is effective in treating cancer when expression of the marker is altered in the presence of the agent.

The present invention further provides methods for determining whether treatment with an agent should be continued in a cancer patient, comprising the steps of:

a) obtaining two or more samples comprising cancer cells from a patient during the course of treatment with the agent;

b) determining and quantifying the level of expression of a marker identified in Tables 1 and 5 (FIGS. 1 and 5, respectively) in the two or more samples; and

c) continuing treatment when the expression level of the marker is at a certain level, e.g., not significantly altered during the course of treatment.

The present invention also provides methods of identifying new cancer treatments, comprising the steps of:

a) obtaining a sample of cancer cells;

b) determining and quantifying the level of expression of a marker identified in Tables 1 and 5 (FIGS. 1 and 5, respectively);

c) exposing the sample to the cancer treatment;

d) determining the level of expression of the marker in the sample exposed to the cancer treatment; and

e) identifying that the cancer treatment is effective in treating cancer when the marker is expressed at a certain level.

Accordingly, in another aspect, the present invention provides methods for diagnosing cancer, comprising the steps of:

a) obtaining a sample of tissue that might contain cancer cells; and

b) determining and quantifying in the tissue the level of the c-myc×E2F-1/p21 index.
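By way of non-limiting illustration, an index of this form may be calculated as in the following minimal sketch. The sketch assumes that the index is read as the product of the c-myc and E2F-1 expression values divided by the p21 expression value, with all three values measured in the same normalized units; the function name and the example values are hypothetical and are not taken from the Tables.

```python
def myc_e2f1_p21_index(c_myc: float, e2f1: float, p21: float) -> float:
    """Interactive index computed as (c-myc x E2F-1) / p21.

    Assumption: all three expression values are positive and expressed
    in the same normalized units.
    """
    if c_myc < 0 or e2f1 < 0:
        raise ValueError("expression values must not be negative")
    if p21 <= 0:
        raise ValueError("p21 expression must be positive to form the ratio")
    return c_myc * e2f1 / p21


# Hypothetical expression values, for illustration only.
index_value = myc_e2f1_p21_index(c_myc=2.0, e2f1=3.0, p21=4.0)
```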

As used herein, an agent is said to reduce the rate of growth of cancer cells when the agent can reduce at least 50%, preferably at least 75%, most preferably at least 95% of the growth of the cancer cells. Such inhibition can further include a reduction in survivability and an increase in the rate of death of the cancer cells. The amount of agent used for this determination will vary based on the agent selected. Typically, the amount will be a predefined therapeutic amount.
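As a non-limiting illustration of the 50%, 75%, and 95% levels recited above, the following minimal sketch computes the fractional reduction in growth relative to untreated cells and reports which level is met. The function names and the choice of growth measurement (for example, a cell count after a fixed period) are assumptions made for illustration.

```python
def growth_reduction(untreated_growth: float, treated_growth: float) -> float:
    """Fractional reduction in growth caused by the agent.

    Assumption: both arguments are the same kind of growth measurement
    (e.g. cell count after a fixed period) and untreated_growth > 0.
    """
    if untreated_growth <= 0:
        raise ValueError("untreated growth must be positive")
    return (untreated_growth - treated_growth) / untreated_growth


def reduction_level(reduction: float) -> str:
    """Map a fractional reduction onto the levels recited in the text."""
    if reduction >= 0.95:
        return "most preferred (at least 95%)"
    if reduction >= 0.75:
        return "preferred (at least 75%)"
    if reduction >= 0.50:
        return "reduces growth (at least 50%)"
    return "below 50%"


# Hypothetical measurements: growth falls from 100 units to 20 units.
level = reduction_level(growth_reduction(100.0, 20.0))
```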

As used herein, the term “agent” is defined broadly as anything that cancer cells may be exposed to in a therapeutic protocol. In the context of the present invention, such agents include, but are not limited to, chemotherapeutic agents, such as anti-metabolic agents, e.g., cross-linking agents, e.g., cisplatin and CBDCA, radiation and ultraviolet light.

Further to the above, the language “chemotherapeutic agent” is intended to include chemical reagents which inhibit the growth of proliferating cells or tissues wherein the growth of such cells or tissues is undesirable.

The agents tested in the present methods can be a single agent or a combination of agents. For example, the present methods can be used to determine whether a single chemotherapeutic agent, such as cisplatin, can be used to treat a cancer or whether a combination of two or more agents can be used. Preferred combinations will include agents that have different mechanisms of action, e.g., the use of an anti-mitotic agent in combination with an alkylating agent.

As used herein, cancer cells refer to cells that divide at an abnormal (increased) rate. In particular, the cancer cells include, but are not limited to, non-small cell lung cancer (NSCLC) cells. The source of the cancer cells used in the present method will be based on how the method of the present invention is being used. For example, if the method is being used to determine whether a patient's cancer can be treated with an agent, or a combination of agents, then the preferred source of cancer cells will be cancer cells obtained from a cancer biopsy from the patient. Alternatively, a cancer cell line similar to the type of cancer being treated can be assayed. For example, if non-small cell lung cancer (NSCLC) is being treated, then an NSCLC cell line can be used. If the method is being used to monitor the effectiveness of a therapeutic protocol, then a tissue sample from the patient being treated is the preferred source. If the method is being used to identify new therapeutic agents or combinations, any cancer cells, e.g., cells of a cancer cell line, can be used.

A skilled artisan can readily select and obtain the appropriate cancer cells that are used in the present method. For cancer cell lines, sources such as The National Cancer Institute, for the NCI cells used in the examples, are preferred. For cancer cells obtained from a patient, standard biopsy methods, such as a needle biopsy, can be employed, taking necessary precautions known in the art to preserve mRNA integrity.

In the methods of the present invention, the level or amount of expression of one or more markers selected from the group consisting of the markers identified in Table 1 (FIG. 1) is determined. As used herein, the level or amount of expression refers to the level of expression of an mRNA encoded by the gene or the level of expression of the protein encoded by the gene (i.e., whether or not expression is occurring in the cancer cells). It also may refer to the values of the interactive gene expression indices (IGEI) disclosed herein. A skilled artisan can readily adapt known mRNA detection methods for use in detecting the level of mRNA encoded by one or more of the (IGEI) marker sets of the present invention.

Proteins from cancer cells can be isolated using techniques that are well known to those of skill in the art. The protein isolation methods employed can, for example, be such as those described in Harlow and Lane (Harlow and Lane, 1988, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

A variety of formats can be employed to determine whether a sample contains a protein that binds to a given antibody. Examples of such formats include, but are not limited to, enzyme immunoassay (EIA), radioimmunoassay (RIA), Western blot analysis and enzyme linked immunosorbent assay (ELISA). A skilled artisan can readily adapt known protein/antibody detection methods for use in determining whether cancer cells express a protein encoded by one or more of the (IGEI) marker sets of the present invention.

In one format, antibodies, or antibody fragments, can be used in methods such as Western blots or immunofluorescence techniques to detect the expressed proteins. In such uses, it is generally preferable to immobilize either the antibody or protein on a solid support. Suitable solid phase supports or carriers include any support capable of binding an antigen or an antibody. Well-known supports or carriers include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, gabbros, and magnetite. In addition, the solid support can be selected from a membrane, a glass support, a filter, a tissue culture dish, a polymeric material, a bead and a silica support.

In certain embodiments, the solid support comprises at least two oligonucleotides, wherein each of the oligonucleotides comprises a sequence that specifically hybridizes to at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively). Also, the solid support can include oligonucleotides that are covalently attached to the solid support, or alternatively, are non-covalently attached to the solid support.

One skilled in the art will know many other suitable carriers for binding antibody or antigen, and will be able to adapt such support for use with the present invention. For example, protein isolated from cancer cells can be run on a polyacrylamide gel and immobilized onto a solid phase support such as nitrocellulose. The support can then be washed with suitable buffers followed by treatment with the detectably labeled marker product specific antibody. The solid phase support can then be washed with the buffer a second time to remove unbound antibody. The amount of bound label on the solid support can then be detected by conventional means.

Another embodiment of the present invention includes a step of detecting whether an agent stimulates the expression of one or more of the (IGEI) marker sets of the present invention. Although some of the present (IGEI) marker sets can be expressed in non-treated cancer cells, treatment with an agent may, or may not, alter expression. Alterations in the expression level of the (IGEI) marker sets of the present invention can provide a further indication as to whether an agent will or will not be effective at reducing the growth rate of the cancer cells.

In such a use, the present invention provides methods for determining whether an agent, e.g., a chemotherapeutic agent, can be used to inhibit cancer cells comprising the steps of:

a) obtaining a sample of cancer cells;

b) exposing the sample of cancer cells to one or more test agents;

c) determining and quantifying the level of expression in the cancer cells of one or more markers selected from the group consisting of the markers identified in Table 1 (FIG. 1) in the sample exposed to the agent and in a sample of cancer cells that is not exposed to the agent; and

d) identifying that an agent can be used to treat the cancer when the expression of one or more of the markers is increased in the presence of said agent and/or when the expression of one or more of the markers is not increased in the presence of said agent.

This embodiment of the methods of the present invention involves the step of exposing the cancer cells to an agent. The method used for exposing the cancer cells to the agent will be based primarily on the source and nature of the cancer cells and the agent being tested. The contacting can be performed in vitro or in vivo, in a patient being treated/evaluated or in an animal model of a cancer. For cancer cells and cell lines and chemical compounds, exposing the cancer cells involves contacting the cancer cells with the compound, such as in tissue culture media. A skilled artisan can readily adapt an appropriate procedure for contacting cancer cells with any particular agent or combination of agents.

As discussed above, the identified (IGEI) marker sets can also be used to assess whether a tumor has become refractory to an ongoing treatment (e.g., a chemotherapeutic treatment). When a tumor is no longer responding to a treatment, the expression profile of the tumor cells will change: the level of expression of one or more of the markers will be reduced and/or the level of expression of one or more of the markers will increase.

In such a use, the invention provides methods for determining whether an anti-cancer treatment should be continued in a cancer patient, comprising the steps of:

a) obtaining two or more samples of cancer cells from a patient undergoing anti-cancer therapy;

b) determining and quantifying the level of expression of one or more markers selected from the group and one or more of the corresponding (IGEI) marker sets in the sample exposed to the agent and in a sample of cancer cells that is not exposed to the agent; and

c) discontinuing treatment when the expression of one or more (IGEI) marker sets is altered.

As used herein, a patient refers to any subject undergoing treatment for cancer. The preferred subject will be a human patient undergoing chemotherapy treatment.

This embodiment of the present invention relies on comparing two or more samples obtained from a patient undergoing anti-cancer treatment. In general, it is preferable to obtain a first sample from the patient prior to beginning therapy and one or more samples during treatment. In such a use, a baseline of expression prior to therapy is determined and then changes in the baseline state of expression are monitored during the course of therapy. Alternatively, two or more successive samples obtained during treatment can be used without the need of a pre-treatment baseline sample. In such a use, the first sample obtained from the subject is used as a baseline for determining whether the expression of a particular marker is increasing or decreasing.

In general, when monitoring the effectiveness of a therapeutic treatment, two or more samples from the patient are examined. Preferably, three or more successively obtained samples are used, including at least one pretreatment sample.
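By way of non-limiting illustration, the comparison of successive samples against a baseline may be sketched as follows. The first value in the series is treated as the baseline, as described above. The two-fold cutoff used to decide that expression is “altered” is a purely hypothetical threshold chosen for the sketch; the specification does not recite a numerical cutoff, and the function names are likewise hypothetical.

```python
from typing import Sequence


def is_altered(baseline: float, later_levels: Sequence[float],
               max_fold_change: float = 2.0) -> bool:
    """Return True if any later sample differs from the baseline by at
    least `max_fold_change` in either direction.

    Assumption: the 2-fold default is illustrative only, and all
    expression levels are positive.
    """
    if baseline <= 0 or any(level <= 0 for level in later_levels):
        raise ValueError("expression levels must be positive")
    for level in later_levels:
        fold = level / baseline
        if fold >= max_fold_change or fold <= 1.0 / max_fold_change:
            return True
    return False


def treatment_decision(levels: Sequence[float],
                       max_fold_change: float = 2.0) -> str:
    """Use the first sample as the baseline and compare the rest to it."""
    if len(levels) < 2:
        raise ValueError("two or more samples are required")
    baseline, *later = levels
    return "discontinue" if is_altered(baseline, later, max_fold_change) else "continue"


# Hypothetical marker levels: pretreatment sample, then two samples
# obtained during treatment.
stable = treatment_decision([10.0, 11.0, 9.0])
changed = treatment_decision([10.0, 12.0, 25.0])
```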

The present invention further provides kits comprising compartmentalized containers comprising reagents for detecting one or more, preferably two or more, of the markers and/or (IGEI) marker sets of the present invention. As used herein a kit is defined as a pre-packaged set of containers into which reagents are placed. The reagents included in the kit comprise probes/primers and/or antibodies for use in detecting (IGEI) marker sets expression. In addition, the kits of the present invention may preferably contain instructions which describe a suitable detection assay. Such kits can be conveniently used, e.g., in clinical settings, to diagnose patients exhibiting symptoms of cancer.

Various aspects of the invention are described in further detail in the following subsections.

Nucleic Acid Samples

As is apparent to one of ordinary skill in the art, nucleic acid samples used in the methods and assays of the invention may be prepared by any available method or process. Methods of isolating total mRNA are also well known to those of skill in the art. Such samples include RNA samples, but also include cDNA synthesized from an mRNA sample isolated from a cell or tissue of interest. Such samples also include DNA amplified from the cDNA, and an RNA transcribed from the amplified DNA. One of skill in the art would appreciate that it is desirable to inhibit or destroy RNase present in homogenates before homogenates can be used.

Biological samples may be of any biological tissue or fluid or cells from any organism as well as cells raised in vitro, such as cell lines and tissue culture cells. Frequently the sample will be a “clinical sample” which is a sample derived from a patient. Typical clinical samples include, but are not limited to, sputum, blood, blood-cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes.

Thus, one aspect of the invention pertains to isolated nucleic acid molecules that correspond to a marker of the invention, including nucleic acids which encode a polypeptide corresponding to a marker of the invention or a portion of such a polypeptide. Isolated nucleic acids of the invention also include nucleic acid molecules sufficient for use as hybridization probes to identify nucleic acid molecules that correspond to a marker of the invention, including nucleic acids which encode a polypeptide corresponding to a marker of the invention, and fragments of such nucleic acid molecules, e.g., those suitable for use as PCR primers for the amplification or mutation of nucleic acid molecules. As used herein, the term “nucleic acid molecule” is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

An “isolated” nucleic acid molecule is one which is separated from other nucleic acid molecules which are present in the natural source of the nucleic acid molecule. Preferably, an “isolated” nucleic acid molecule is free of sequences (preferably protein-encoding sequences) which naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived.

A nucleic acid molecule of the present invention, e.g., a nucleic acid encoding a protein corresponding to a marker listed in Table 1 (FIG. 1), can be isolated using standard molecular biology techniques and the sequence information in the database records described herein. Using all or a portion of such nucleic acid sequences, nucleic acid molecules of the invention can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook et al., ed., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

A nucleic acid molecule of the invention can be amplified using cDNA, mRNA, or genomic DNA as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding to all or a portion of a nucleic acid molecule of the invention can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.

In another preferred embodiment, an isolated nucleic acid molecule of the invention comprises a nucleic acid molecule which has a nucleotide sequence complementary to the nucleotide sequence of a nucleic acid corresponding to a marker of the invention or to the nucleotide sequence of a nucleic acid encoding a protein which corresponds to a marker of the invention. A nucleic acid molecule which is complementary to a given nucleotide sequence is one which is sufficiently complementary to the given nucleotide sequence that it can hybridize to the given nucleotide sequence thereby forming a stable duplex.

Moreover, a nucleic acid molecule of the invention can comprise only a portion of a nucleic acid sequence, wherein the full length nucleic acid sequence comprises a marker of the invention or which encodes a polypeptide corresponding to a marker of the invention. Such nucleic acids can be used, for example, as a probe or primer. The probe/primer typically is used as one or more substantially purified oligonucleotides. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 7, preferably about 12 or more consecutive nucleotides of a nucleic acid of the invention.

Probes based on the sequence of a nucleic acid molecule of the invention can be used to detect transcripts or genomic sequences corresponding to one or more markers of the invention. The probe comprises a label group attached thereto, e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such probes can be used as part of a diagnostic test kit for identifying cells or tissues which mis-express the protein, such as by measuring levels of a nucleic acid molecule encoding the protein in a sample of cells from a subject, e.g., detecting mRNA levels or determining whether a gene encoding the protein has been mutated or deleted.

The invention further encompasses nucleic acid molecules that differ, due to degeneracy of the genetic code, from the nucleotide sequence of nucleic acids encoding a protein which corresponds to a marker of the invention, and thus encode the same protein.
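As a non-limiting illustration of degeneracy, the following minimal sketch translates two different nucleotide sequences that encode the same short peptide. Only the handful of standard genetic code entries needed for the example are included; the sequences themselves are hypothetical and do not correspond to any marker of the invention.

```python
# Small excerpt of the standard genetic code (DNA codons), sufficient
# for this example only. "*" denotes a stop codon.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first
    stop codon. Raises KeyError for codons outside the excerpt above."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two hypothetical sequences that differ at three positions yet encode
# the same peptide (Met-Ala-Lys).
first = "ATGGCTAAATAA"
second = "ATGGCGAAGTGA"
same_protein = translate(first) == translate(second)
```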

In addition to the nucleotide sequences described in the GenBank database records described herein, it will be appreciated by those skilled in the art that DNA sequence polymorphisms that lead to changes in the amino acid sequence can exist within a population (e.g., the human population). Such genetic polymorphisms can exist among individuals within a population due to natural allelic variation. An allele is one of a group of genes which occur alternatively at a given genetic locus. In addition, it will be appreciated that DNA polymorphisms that affect RNA expression levels can also exist that may affect the overall expression level of that gene (e.g., by affecting regulation or degradation).

As used herein, the phrase “allelic variant” refers to a nucleotide sequence which occurs at a given locus or to a polypeptide encoded by the nucleotide sequence.

As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules comprising an open reading frame encoding a polypeptide corresponding to a marker of the invention. Such natural allelic variations can typically result in 1-5% variance in the nucleotide sequence of a given gene. Alternative alleles can be identified by sequencing the gene of interest in a number of different individuals. This can be readily carried out by using hybridization probes to identify the same genetic locus in a variety of individuals. Any and all such nucleotide variations and resulting amino acid polymorphisms or variations that are the result of natural allelic variation and that do not alter the functional activity are intended to be within the scope of the invention.

In another embodiment, an isolated nucleic acid molecule of the invention is at least 7, 15, 20, 25, 30, 40, 60, 80, 100, 150, 200, 250, 300, 350, 400, 450, 550, 650, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800, 3000, 3500, 4000, 4500, or more nucleotides in length and hybridizes under stringent conditions to a nucleic acid corresponding to a marker of the invention or to a nucleic acid encoding a protein corresponding to a marker of the invention. As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60% (65%, 70%, preferably 75%) identical to each other typically remain hybridized to each other. Such stringent conditions are known to those skilled in the art and can be found in sections 6.3.1-6.3.6 of Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989). A preferred, non-limiting example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 50-65° C.

In addition to naturally-occurring allelic variants of a nucleic acid molecule of the invention that can exist in the population, the skilled artisan will further appreciate that sequence changes can be introduced by mutation thereby leading to changes in the amino acid sequence of the encoded protein, without altering the biological activity of the protein encoded thereby. For example, one can make nucleotide substitutions leading to amino acid substitutions at “non-essential” amino acid residues. A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence without altering the biological activity, whereas an “essential” amino acid residue is required for biological activity. For example, amino acid residues that are not conserved or only semi-conserved among homologs of various species may be non-essential for activity and thus would be likely targets for alteration. Alternatively, amino acid residues that are conserved among the homologs of various species (e.g., murine and human) may be essential for activity and thus would not be likely targets for alteration.

Accordingly, another aspect of the invention pertains to nucleic acid molecules encoding a polypeptide of the invention that contain changes in amino acid residues that are not essential for activity. Such polypeptides differ in amino acid sequence from the naturally-occurring proteins which correspond to the markers of the invention, yet retain biological activity. In one embodiment, such a protein has an amino acid sequence that is at least about 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 98% identical to the amino acid sequence of one of the proteins which correspond to the markers of the invention.

An isolated nucleic acid molecule encoding a variant protein can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of nucleic acids of the invention, such that one or more amino acid residue substitutions, additions, or deletions are introduced into the encoded protein. Mutations can be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Alternatively, mutations can be introduced randomly along all or part of the coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for biological activity to identify mutants that retain activity. Following mutagenesis, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
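By way of non-limiting illustration, the side-chain families listed above may be encoded and used to check whether a given substitution is conservative, as in the following minimal sketch. The one-letter amino acid codes are standard; the dictionary and function names are hypothetical, and membership follows the families exactly as recited in the text.

```python
# Side-chain families as recited in the text, using one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "non-polar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one family."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # no substitution has been made
    return any(
        original in members and replacement in members
        for members in SIDE_CHAIN_FAMILIES.values()
    )


# Lysine to arginine stays within the basic family; aspartic acid to
# lysine crosses from the acidic family to the basic family.
lys_to_arg = is_conservative("K", "R")
asp_to_lys = is_conservative("D", "K")
```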

The present invention encompasses antisense nucleic acid molecules, i.e., molecules which are complementary to a sense nucleic acid of the invention, e.g., complementary to the coding strand of a double-stranded cDNA molecule corresponding to a marker of the invention or complementary to an mRNA sequence corresponding to a marker of the invention. Accordingly, an antisense nucleic acid of the invention can hydrogen bond to (i.e. anneal with) a sense nucleic acid of the invention. The antisense nucleic acid can be complementary to an entire coding strand, or to only a portion thereof, e.g., all or part of the protein coding region (or open reading frame). An antisense nucleic acid molecule can also be antisense to all or part of a non-coding region of the coding strand of a nucleotide sequence encoding a polypeptide of the invention. The non-coding regions (“5′ and 3′ untranslated regions”) are the 5′ and 3′ sequences which flank the coding region and are not translated into amino acids.

An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 or more nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been sub-cloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a polypeptide corresponding to a selected marker of the invention to thereby inhibit expression of the marker, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule which binds to DNA duplexes, through specific interactions in the major groove of the double helix. Examples of routes of administration of antisense nucleic acid molecules of the invention include direct injection at a tissue site or infusion of the antisense nucleic acid into an ovary-associated body fluid. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

The invention also encompasses ribozymes. Ribozymes are catalytic RNA molecules with ribonuclease activity which are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes can be used to catalytically cleave mRNA transcripts to thereby inhibit translation of the protein encoded by the mRNA. A ribozyme having specificity for a nucleic acid molecule encoding a polypeptide corresponding to a marker of the invention can be designed based upon the nucleotide sequence of a cDNA corresponding to the marker.

The invention also encompasses nucleic acid molecules which form triple helical structures. For example, expression of a polypeptide of the invention can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the gene encoding the polypeptide (e.g., the promoter and/or enhancer) to form triple helical structures that prevent transcription of the gene in target cells.

The invention also encompasses the use of RNA interference or “RNAi”, which is a term initially coined by Fire and co-workers to describe the observation that double-stranded RNA (dsRNA) can block gene expression when it is introduced into worms (Fire et al. (1998) Nature 391, 806-811). dsRNA directs gene-specific, post-transcriptional silencing in many organisms, including vertebrates, and has provided a new tool for studying gene function.

The phenomenon of RNA interference is described and discussed in Bass, Nature 411: 428-29 (2001); Elbashir et al., Nature 411: 494-98 (2001); and Fire et al., Nature 391: 806-11 (1998), where methods of making interfering RNA also are discussed. An “siRNA” or “RNAi” refers to a nucleic acid that forms a double stranded RNA, which double stranded RNA has the ability to reduce or inhibit expression of a gene or target gene when the siRNA is expressed in the same cell as the gene or target gene. “siRNA” thus refers to the double stranded RNA formed by the complementary strands. The complementary portions of the siRNA that hybridize to form the double stranded molecule typically have substantial or complete identity. In one embodiment, an siRNA refers to a nucleic acid that has substantial or complete identity to a target gene and forms a double stranded siRNA. The sequence of the siRNA can correspond to the full-length target gene, or a subsequence thereof. Typically, the siRNA is at least about 15-50 nucleotides in length (e.g., each complementary sequence of the double stranded siRNA is 15-50 nucleotides in length, and the double stranded siRNA is about 15-50 base pairs in length), preferably about 20-30 nucleotides, more preferably about 20-25 nucleotides in length, e.g., 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length.

In various embodiments, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids. As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis.

PNAs can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs can also be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., PNA directed PCR clamping, or as artificial restriction enzymes when used in combination with other enzymes.

In other embodiments, the oligonucleotide can include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane or the blood-brain barrier. In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents or intercalating agents. The oligonucleotide can be conjugated to another molecule, e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.

The invention also includes molecular beacon nucleic acids having at least one region which is complementary to a nucleic acid of the invention, such that the molecular beacon is useful for quantitating the presence of the nucleic acid of the invention in a sample. A “molecular beacon” nucleic acid is a nucleic acid comprising a pair of complementary regions and having a fluorophore and a fluorescent quencher associated therewith. The fluorophore and quencher are associated with different portions of the nucleic acid in such an orientation that when the complementary regions are annealed with one another, fluorescence of the fluorophore is quenched by the quencher. When the complementary regions of the nucleic acid are not annealed with one another, fluorescence of the fluorophore is quenched to a lesser degree.

Microarrays

In another aspect, the present invention describes the use of high density oligonucleotide microarrays or solid supports or microbeads to measure in a standardized fashion PCR products following standardized quantitative RT-PCR according to the methods described herein, as shown in FIGS. 9 a and 9 b.

In certain embodiments, high-density oligonucleotide arrays can be prepared with the following properties. For each gene, an oligonucleotide of any length that will bind with specificity to both the competitive template, CT, and native template, NT, is spotted on a filter. To identify a suitable oligonucleotide, the region between the forward primer (common to both the NT and CT) and the 3′ 20 bp of the reverse CT primer is evaluated. An oligonucleotide with a high melting temperature, preferably greater than about 70 degrees centigrade, is selected and attached to the solid support at a previously designated location. In FIG. 9A the oligonucleotides specific to each gene are designated with different bars (open, slashed, or striped).

Then, the CT and NT PCR products, amplified according to the methods described above, are hybridized to the spots. Each gene (NT and CT) is amplified separately. Then the PCR products are pooled for hybridization to the membrane described above, and illustrated in FIG. 9A. The CT and NT PCR products appear as thin black curved lines in FIG. 9A.

Two oligonucleotide probes, each labeled with a different fluor, are prepared for each gene. One oligonucleotide will be homologous to, and will bind to, sequences unique to the NT for a gene that was PCR-amplified using the methods described herein. This oligonucleotide will bind to the region of the NT that is not homologous to the CT and will be labeled with one fluor. The other oligonucleotide will be specific to the CT and will be labeled with a different fluor. It will be homologous to and will bind to CT sequences that span the 3′ end of the reverse primer. The NT-specific and CT-specific oligonucleotides for multiple genes will be mixed in equal amounts and hybridized to the gene-specific PCR products bound to the gene-specific oligonucleotides spotted on the filter. The ratio between the fluors bound to the spot will quantify the NT relative to the CT. In FIG. 9A, the fluorescent tagged probe (shaded black) is specific to the NT and the fluorescent tagged probe (unshaded) is specific to the CT.

In this assay, although there may be different binding affinities between the CT and CT probe relative to that between the NT and NT probe, this difference will be consistent between different samples assessed, and from one experiment to another.
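The following is an illustrative sketch only of why a consistent difference in probe affinity does not disturb comparisons between samples. It assumes, purely for illustration, that the affinity difference acts as a constant multiplicative factor on the NT-probe signal; the function names and signal values are hypothetical and are not measurements from the disclosure.

```python
def nt_per_ct(nt_signal: float, ct_signal: float) -> float:
    """Ratio of NT-probe fluorescence to CT-probe fluorescence for one spot."""
    if ct_signal <= 0:
        raise ValueError("CT signal must be positive")
    return nt_signal / ct_signal


def fold_change(sample_a: tuple, sample_b: tuple) -> float:
    """Ratio of the NT/CT ratios of two samples measured on the same spot type."""
    return nt_per_ct(*sample_a) / nt_per_ct(*sample_b)


def with_affinity_bias(sample: tuple, bias: float) -> tuple:
    """Scale the NT-probe signal by an assumed constant affinity factor."""
    nt_signal, ct_signal = sample
    return (nt_signal * bias, ct_signal)


if __name__ == "__main__":
    # Hypothetical (NT signal, CT signal) pairs for the same gene in two samples.
    sample_a = (800.0, 400.0)
    sample_b = (200.0, 400.0)
    unbiased = fold_change(sample_a, sample_b)
    biased = fold_change(with_affinity_bias(sample_a, 1.7),
                         with_affinity_bias(sample_b, 1.7))
    print(unbiased, biased)  # the same bias in both samples leaves the comparison unchanged
```

Under this assumption the absolute NT/CT ratio of a single sample is shifted by the bias factor, but the factor divides out whenever two samples are compared.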

This method also works with other solid phase hybridization templates including, for example, microbeads, glass slides, or chips prepared by photolithography. No matter what template is used, the products of standardized RT-PCR, using the standardized mixture of competitive templates, will be the starting point, as shown with microbeads in FIG. 9B, where gene specificity is conferred by the fluorescent color of the bead rather than by the location on the microarray.

Cisplatin

The examples set forth below relate to cis-Diamminedichloroplatinum(II), otherwise known as cisplatin, and related compounds. Cisplatin is a chemical compound within a family of platinum coordination complexes which are art-recognized as being a family of related compounds. Cisplatin was the first platinum compound shown to have anti-malignant properties. The language “platinum compounds” is intended to include cisplatin, compounds which are structurally similar to cisplatin, as well as analogs and derivatives of cisplatin. The language “platinum compounds” can also include “mimics”. “Mimics” is intended to include compounds which may not be structurally similar to cisplatin but mimic the therapeutic activity of cisplatin or structurally related compounds in vivo.

The platinum compounds of this invention are those compounds which are useful for inhibiting tumor growth in subjects (patients). More than 1000 platinum-containing compounds have been synthesized and tested for therapeutic properties. One of these, carboplatin, has been approved for treatment of ovarian cancer. Both cisplatin and carboplatin are amenable to intravenous delivery. However, compounds of the invention can be formulated for therapeutic delivery by any number of strategies. The term platinum compounds also is intended to include pharmaceutically acceptable salts and related compounds. Platinum compounds have previously been described in U.S. Pat. Nos. 6,001,817, 5,945,122, 5,942,389, 5,922,689, 5,902,610, 5,866,617, 5,849,790, 5,824,346, 5,616,613, and 5,578,571, all of which are expressly incorporated by reference.

Cisplatin and related compounds are thought to enter cells through diffusion, whereupon the molecule likely undergoes metabolic processing to yield the active metabolite of the drug, which then reacts with nucleic acids and proteins. Cisplatin has biochemical properties similar to those of bifunctional alkylating agents, producing interstrand, intrastrand, and monofunctional adduct cross-linking with DNA.

Databases

The present invention includes relational numerically standardized databases containing sequence information, for instance for the genes of Tables 1 and 5 (FIGS. 1 and 5, respectively), as well as gene expression information in various lung tissue samples. Databases may also contain information associated with a given sequence or tissue sample such as descriptive information about the gene associated with the sequence information, or descriptive information concerning the clinical status of the tissue sample, or the patient from which the sample was derived. The database may be designed to include different parts, for instance a sequence database and a gene expression database. Methods for the configuration and construction of such databases are widely available.
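One possible arrangement of such a database, offered only as a minimal sketch, separates gene records, tissue sample records and expression measurements into linked tables. The table names, column names and inserted values below are hypothetical placeholders chosen for illustration; they are not taken from Tables 1 and 5, and the accession column is left empty rather than filled with guessed identifiers.

```python
import sqlite3

SCHEMA = """
CREATE TABLE gene (
    gene_id     INTEGER PRIMARY KEY,
    symbol      TEXT NOT NULL UNIQUE,
    accession   TEXT,            -- link to an external database record, if any
    description TEXT
);
CREATE TABLE sample (
    sample_id     INTEGER PRIMARY KEY,
    tissue_status TEXT NOT NULL,  -- e.g. 'normal' or 'NSCLC'
    clinical_note TEXT
);
CREATE TABLE expression (
    gene_id   INTEGER NOT NULL REFERENCES gene(gene_id),
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    molecules_per_1e6_actb REAL NOT NULL,  -- numerically standardized value
    PRIMARY KEY (gene_id, sample_id)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_placeholder_rows(conn: sqlite3.Connection) -> None:
    """Insert invented rows so the query below has something to return."""
    conn.execute("INSERT INTO gene (gene_id, symbol) VALUES (1, 'ERCC2')")
    conn.executemany(
        "INSERT INTO sample (sample_id, tissue_status) VALUES (?, ?)",
        [(1, "normal"), (2, "NSCLC"), (3, "NSCLC")],
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [(1, 1, 100.0), (1, 2, 300.0), (1, 3, 500.0)],  # invented values
    )
    conn.commit()


def mean_expression_by_status(conn: sqlite3.Connection, symbol: str) -> dict:
    """Average standardized expression of one gene, grouped by tissue status."""
    rows = conn.execute(
        """
        SELECT s.tissue_status, AVG(e.molecules_per_1e6_actb)
        FROM expression AS e
        JOIN gene AS g ON g.gene_id = e.gene_id
        JOIN sample AS s ON s.sample_id = e.sample_id
        WHERE g.symbol = ?
        GROUP BY s.tissue_status
        """,
        (symbol,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    connection = build_database()
    load_placeholder_rows(connection)
    print(mean_expression_by_status(connection, "ERCC2"))
```

Keeping expression values in their own table, keyed by gene and sample, lets the sequence-related and expression-related parts of the database be maintained separately, as the paragraph above contemplates.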

The numerically standardized databases of the invention may be linked to an outside or external database. In a preferred embodiment, as described in Tables 1-5 (FIGS. 1-5), the external database is GenBank and the associated databases maintained by the National Center for Biotechnology Information (NCBI).

Any appropriate computer platform may be used to perform the necessary comparisons between sequence information, gene expression information and any other information in the database or provided as an input. For example, a large number of computer workstations are available from a variety of manufacturers, such as those available from Silicon Graphics. Client-server environments, database servers and networks are also widely available and appropriate platforms for the databases of the invention.

The databases of standardized numerical data of the invention may be used to produce, among other things, electronic Northerns to allow the user to determine the cell type or tissue in which a given gene is expressed and to allow determination of the abundance or expression level of a given gene in a particular tissue or cell.

The databases of the invention may also be used to present information identifying the expression level in a tissue or cell of a set of genes comprising at least one gene in Tables 1-5 (FIGS. 1-5), comprising the step of comparing the expression level of at least one gene in Tables 1-5 (FIGS. 1-5) in the tissue to the level of expression of the gene in the database. Such methods may be used to predict the physiological state of a given tissue by comparing the level of expression of a gene or genes in Tables 1-5 (FIGS. 1-5) from a sample to the expression levels found in tissue from normal lung, malignant lung or NSCLC. Such methods may also be used in the drug or agent screening assays as described below.

Computer System

In another aspect, the present invention relates to a computer system comprising: (a) a database containing standardized numerical gene expression information identifying the expression level in lung tissue of a set of genes comprising at least two genes in Tables 1 and 5 (FIGS. 1 and 5, respectively) or c-myc×E2F1/p21; and (b) a user interface to view the information. The database can further include at least one or more of the following: sequence information for the genes; information identifying the expression level for the set of genes in normal lung tissue; information identifying the expression level of the set of genes in non-small cell cancer tissue; records including descriptive information from an external database, which information correlates said genes to records in the external database, including, for example, where the external database is GenBank; and information on specific characteristics of the cells or tissues or patients from which the samples were derived.

In another aspect, the present invention relates to a method of using the computer system described above to present information identifying the expression level in a tissue or cell of at least one gene in Tables 1 and 5 (FIGS. 1 and 5, respectively), by comparing the expression level of at least one gene in Tables 1 and 5 in the tissue or cell to the level of expression of the gene in the database. In certain embodiments, the expression levels of at least two, five, seven, and/or ten genes are compared.

In yet other aspects, the method further includes displaying the level of expression of at least one gene in the tissue or cell sample compared to the expression level in lung cancer.

Kits

The invention further includes kits combining, in different combinations, at least one of: high-density oligonucleotide arrays, reagents for use with the microarrays, reagents for StaRT-PCR amplification of the specified genes including gene specific primers and standardized mixtures of internal standards, signal detection and array-processing instruments, gene expression databases, and analysis and database management software described above. The kits may be used, for example, to predict or model the toxic response of a test compound, to monitor the progression of disease states, to identify genes that show promise as new drug targets and to screen known and newly designed drugs as discussed herein.

In certain embodiments, the kit includes at least one solid support, as described herein, packaged with gene expression information for said genes. In certain embodiments, the gene expression information comprises gene expression levels in a tissue or cell sample exposed to a toxin. Also, in certain embodiments, the gene expression information is in an electronic format, including, for example, the standardized gene expression database described herein.

The databases packaged with the kits are a compilation of expression patterns from human or laboratory animal genes and gene fragments (corresponding to the genes of Tables 1 and 5). Data is collected from a repository of both normal and diseased tissues and provides reproducible, quantitative results, i.e., the degree to which a gene is up-regulated or down-regulated under a given condition.

The kits are useful in the pharmaceutical industry, where the need for early drug testing is strong due to the high costs associated with drug development, but where bioinformatics, in particular gene expression informatics, is still lacking. These kits reduce the costs, time and risks associated with traditional new drug screening using cell cultures and laboratory animals. The results of large-scale drug screening of pre-grouped patient populations (pharmacogenomics testing) can also be applied to select drugs with greater efficacy and fewer side-effects. The kits may also be used by smaller biotechnology companies and research institutes who do not have the facilities for performing such large-scale testing themselves.

Databases and software designed for use with microarrays are discussed in Balaban et al., U.S. Pat. No. 6,229,911, a computer-implemented method for managing information, stored as indexed tables, collected from small or large numbers of microarrays, and U.S. Pat. No. 6,185,561, a computer-based method with data mining capability for collecting gene expression level data, adding additional attributes and reformatting the data to produce answers to various queries. Chee et al., U.S. Pat. No. 5,974,164, disclose a software-based method for identifying mutations in a nucleic acid sequence based on differences in probe fluorescence intensities between wild type and mutant sequences that hybridize to reference sequences.

Assays and Identification of Therapeutic and Drug Screening Targets

It should be understood that in certain preferred embodiments, the microarrays as described herein, and in particular, with reference to the example shown in FIGS. 9 a and 9 b, are especially useful. However, it should also be understood that in certain other embodiments, other hybridization assay formats may be used, including solution-based and solid support-based assay formats. Solid supports containing oligonucleotide probes for differentially expressed genes of the invention can be filters, polyvinyl chloride dishes, silicon or glass based chips, etc. Such wafers and hybridization methods are widely available. Any solid surface to which oligonucleotides can be bound, either directly or indirectly, either covalently or non-covalently, can be used. Examples of a solid support include a high density array or DNA chip. These contain a particular oligonucleotide probe in a predetermined location on the array. Each predetermined location may contain more than one molecule of the probe, but each molecule within the predetermined location has an identical sequence. Such predetermined locations are termed features. There may be, for example, about 2, 10, 100, 1000 to 10,000; 100,000 or 400,000 of such features on a single solid support. The solid support, or the area within which the probes are attached, may be on the order of a square centimeter.

Oligonucleotide probe arrays for expression monitoring can be made and used according to any techniques known in the art. Such probe arrays may contain at least two or more oligonucleotides that are complementary to or hybridize to two or more of the genes described herein. Such arrays may also contain oligonucleotides that are complementary to or hybridize to at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 70, 100 or more of the genes described herein.

Methods of forming high density arrays of oligonucleotides with a minimal number of synthetic steps are known. The oligonucleotide analogue array can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling, and mechanically directed coupling. In brief, the light-directed combinatorial synthesis of oligonucleotide arrays on a glass surface proceeds using automated phosphoramidite chemistry and chip masking techniques. In one specific implementation, a glass surface is derivatized with a silane reagent containing a functional group, e.g., a hydroxyl or amine group blocked by a photolabile protecting group. Photolysis through a photolithographic mask is used selectively to expose functional groups which are then ready to react with incoming 5′ photoprotected nucleoside phosphoramidites. The phosphoramidites react only with those sites which are illuminated (and thus exposed by removal of the photolabile blocking group). Thus, the phosphoramidites only add to those areas selectively exposed from the preceding step. These steps are repeated until the desired array of sequences has been synthesized on the solid surface. Combinatorial synthesis of different oligonucleotide analogues at different locations on the array is determined by the pattern of illumination during synthesis and the order of addition of coupling reagents.

In addition to the foregoing, additional methods can be used to generate an array of oligonucleotides on a single substrate. High density nucleic acid arrays can also be fabricated by depositing premade or natural nucleic acids in predetermined positions. Synthesized or natural nucleic acids are deposited on specific locations of a substrate by light-directed targeting and oligonucleotide-directed targeting. Another embodiment uses a dispenser that moves from region to region to deposit nucleic acids in specific spots.

Determination of IGEI

A sample of cancerous cells with unknown sensitivity to a given drug is obtained from a patient. An expression level is measured in the sample for a gene corresponding to one of the nucleotide sequences claimed herein as an (IGEI) marker set. The expression level of the marker in the sample is compared with the expression level of the marker measured previously in cells with known drug sensitivity. If the expression level of the marker in the sample is most similar to the expression levels of the marker in cells with low sensitivity to the given drug, then low sensitivity to that drug is predicted for the sample. If the expression level of the marker in the sample is most similar to the expression levels of the marker in cells with medium sensitivity to the given drug, then medium sensitivity to that drug is predicted for the sample. If the expression level is most similar to the expression levels of the marker in cells with high sensitivity to the given drug, then high sensitivity to that drug is predicted for the sample.
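The comparison described above can be sketched, for illustration only, as a nearest-class assignment. The text does not define how "most similar" is measured; the sketch below assumes, as one possibility, the smallest absolute difference between the log10 marker level of the sample and the mean log10 marker level of each reference group. The function name and all reference values are hypothetical.

```python
import math


def predict_sensitivity(sample_level: float, reference_levels: dict) -> str:
    """Return the sensitivity class whose reference levels are closest to the sample.

    reference_levels maps a class label ('low', 'medium', 'high') to a list of
    marker levels measured previously in cells of known drug sensitivity.
    Closeness is an assumed metric: distance between log10 values.
    """
    if sample_level <= 0:
        raise ValueError("expression level must be positive")

    def distance(levels: list) -> float:
        center = sum(math.log10(v) for v in levels) / len(levels)
        return abs(math.log10(sample_level) - center)

    return min(reference_levels, key=lambda label: distance(reference_levels[label]))


if __name__ == "__main__":
    # Invented reference marker levels; the direction of association is arbitrary here.
    references = {
        "low": [10.0, 20.0],
        "medium": [100.0, 200.0],
        "high": [1000.0, 2000.0],
    }
    print(predict_sensitivity(150.0, references))
```

The same function could take an IGEI value in place of a single-gene level, since the comparison only requires a positive number per sample and per reference cell line.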

Thus, by examining the expression of one or more of the identified markers in a sample of cancer cells, it is possible to determine which therapeutic agent(s), or combination of agents, to use as the appropriate treatment agents.

By examining the expression of one or more of the identified markers in a sample of cancer cells taken from a patient during the course of therapeutic treatment, it is also possible to determine whether the therapeutic agent is continuing to work or whether the cancer has become resistant (refractory) to the treatment protocol. These determinations can be made on a patient-by-patient basis or on an agent-by-agent (or combination of agents) basis. Thus, one can determine whether or not a particular therapeutic treatment is likely to benefit a particular patient or group/class of patients, or whether a particular treatment should be continued.

The identified (IGEI) marker sets further provide previously unknown or unrecognized targets for the development of anti-cancer agents, such as chemotherapeutic compounds, and can be used as targets in developing single agent treatment as well as combinations of agents for the treatment of cancer.

EXAMPLES

A skilled artisan can readily recognize that there is no limit as to the structural nature of the agents of the present invention. As such, without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples, therefore, specifically point out the preferred embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

In one embodiment, standardized RT (StaRT)-PCR was employed to assess various multidrug resistance genes in a set of non-small cell lung cancer (NSCLC) cell lines with a previously determined range of sensitivity to cisplatin. Data were obtained in the form of target gene molecules relative to 10⁶ β-actin (ACTB) molecules. To cancel the effect of ACTB variation among the different cell lines, individual gene expression values were incorporated into ratios of one gene to another. Each two-gene ratio was compared as a single variable to chemoresistance for each of eight NSCLC cell lines using multiple regression. Following validation, the single-variable models that best correlated with chemoresistance (p<0.001) were determined. In certain embodiments, the single-variable models included: ERCC2/XPC, ABCC5/GTF2H2, ERCC2/GTF2H2, XPA/XPC and XRCC1/XPC. All single-variable models were examined hierarchically to achieve two-variable models. The two-variable model with the highest correlation was (ABCC5/GTF2H2, ERCC2/GTF2H2) with an R² value of 0.96 (p<0.001). In certain embodiments, these markers are suitable for assessment of small samples of tissue such as fine needle aspirate biopsies to prospectively identify cisplatin resistant tumors.

StaRT-PCR is used to measure expression of 35 genes involved in DNA repair, multi-drug resistance, cell cycling and apoptosis in two cell lines previously reported to be the least (H460) and most (H1435) chemoresistant among 20 NSCLC cell lines. Weaver, D. A., Zahorchak, R., Varnavas, L., Crawford, E. L., Warner, K. A., Willey, J. C., Comparison of expression patterns by microarray and standardized RT-PCR analyses in lung cancer cell lines with varied sensitivity to carboplatin, Proc Am Assoc Cancer Res (2001) abstract, 42, 606. Tsai, C. M., Chang, K. T., Wu, L. H., Chen, J. Y., Gazdar, A. F., Mitsudomi, T., Chen, M. H., Perng, R. P., Correlations between intrinsic chemoresistance and HER-2/neu gene expression, p53 mutations, and cell proliferation characteristics in non-small cell lung cancer cell lines, Cancer Res (1996), 56, 206-209. Genes involved in DNA repair (ERCC2, XRCC1) and drug influx/efflux (ABCC5) are associated with chemoresistance. The number of genes from each of these two categories was expanded to include additional representative genes associated with generalized DNA damage recognition and repair (DDIT3), associated specifically with NER (LIG1, ERCC3, GTF2H2, XPA, XPC), or associated with drug transport (ABCC1, ABCC4, ABCC10). Expression of these twelve genes was measured in eight NSCLC cell lines with variable cisplatin resistance (Tsai et al., Cancer Res (1996), cited above). StaRT-PCR data were obtained using ACTB as a reference gene. Thus, data were reported in the form of mRNA molecules/10⁶ ACTB molecules. These data then were combined into interactive gene expression indices (IGEI) by placing one or more genes directly associated with the phenotype on the numerator and one or more genes negatively associated with the phenotype on the denominator, using the quantitative reverse transcriptase-PCR method described in the Willey U.S. Pat. Nos. 5,639,606; 5,643,765; and 5,876,978. Willey, J. C., Crawford, E. L., Jackson, C. M., Weaver, D. A., Hoban, J. C., Khuder, S. A., DeMuth, J. P., Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates, Am J Respir Cell Mol Biol (1998), 19, 6-17. DeMuth, J. P., Jackson, C. M., Weaver, D. A., Crawford, E. L., Durzinsky, D. S., Durham, S. J., Zaher, A., Philips, E. R., Khuder, S. A., Willey, J. C., The gene expression index c-myc×E2F1/p21 is highly predictive of malignant phenotype in human bronchial epithelial cells, Am J Respir Cell Mol Biol (1998), 19, 18-24. For certain cancer-related phenotypes, the IGEI are better predictors than are the expression levels of individual genes (Willey et al. (1998) and DeMuth et al. (1998), both cited above). Crawford, E. L., Khuder, S. A., Durham, S. J., Frampton, M., Utell, M., Thilly, W. G., Weaver, D. A., Ferencak, W. J., Jennings, C. A., Hammersley, J. R., Olson, D. A., Willey, J. C., Normal bronchial epithelial cell expression of glutathione transferase P1, glutathione transferase M3, and glutathione peroxidase is low in subjects with bronchogenic carcinoma, Cancer Res (2000), 60, 1609-1618. Rots, J. G., Willey, J. C., Jansen, G., Van Zantwijk, C. H., Noordhuis, P., DeMuth, J. P., Kuiper, E., Verrman, A. J., Pieters, R., Peters, G. J., mRNA expression levels of methotrexate resistance-related proteins in childhood leukemia as determined by a standardized competitive template-based RT-PCR method, Leukemia (2000), 14, 2166-2175. A further advantage of IGEI is that they control for previously observed variation in the reference gene value (in this case, ACTB) from one cell line to another (Willey et al. (1998) and DeMuth et al. (1998), both cited above). When a single gene in the numerator is divided by another single gene in the denominator, the reference value mathematically cancels out. The IGEI values were compared to cisplatin chemoresistance among the eight NSCLC cell lines with variable resistance. Results then were validated in an additional six NSCLC cell lines.

Example I Materials and Methods

Cell Culture

Non-small cell lung cancer (NSCLC) cell lines H460, H1155, H23, H838, H1334, H1437, H1355, H1435, H358, H322, H441, H522, H226 and H647 were obtained from the American Type Culture Collection (Rockville, Md.). All cells were incubated in RPMI-1640 medium (Biofluids, Inc., Rockville, Md.) containing 10% fetal bovine serum (FBS) and 1 mM glutamine at 37° C. in the presence of 5% CO2. Proliferative, subconfluent cultures were obtained for RNA extractions and subsequent analyses.

Reagents

10× PCR buffer for the Rapidcycler (500 mM Tris, pH 8.3; 2.5 mg/μl BSA; 30 mM MgCl2) was obtained from Idaho Technology, Inc. (Idaho Falls, Id.). Taq polymerase (5 U/μl), oligo dT primers, RNasin (25 U/μl) and dNTPs were obtained from Promega (Madison, Wis.). M-MLV reverse transcriptase (200 U/μl) and 5× first strand buffer (250 mM Tris-HCl, pH 8.3; 375 mM KCl; 15 mM MgCl2; 50 mM DTT) were obtained from Gibco BRL (Gaithersburg, Md.). DNA 7500 Assay kits containing dye, matrix and standards were obtained from Agilent Technologies, Inc. (Palo Alto, Calif.). All other chemicals and reagents were molecular biology grade.

RNA Extraction and Reverse Transcription

Total RNA was isolated from cell cultures by a TriReagent protocol (Molecular Research Center, Inc., Cincinnati, Ohio). Chomczynski, P., A reagent for the single-step simultaneous isolation of RNA, DNA and proteins from cell and tissue samples, Biotechniques (1993), 15, 536-537. Following extraction, approximately 1 μg of total RNA for each cell line was reverse-transcribed using M-MLV reverse transcriptase and an oligo dT primer as previously described in Willey, J. C., Coy, E., Brolly, C., Utell, M. J., Frampton, M. W., Hammersley, J., Thilly, W. G., Olson, D., Cairns, K., Xenobiotic metabolism enzyme gene expression in human bronchial epithelial and alveolar macrophage cells, Am. J. Respir. Cell Biol. (1996), 14, 262-271.

Quantitative Standardized RT (StaRT)-PCR

Gene expression was determined using quantitative StaRT-PCR protocols described in U.S. Pat. Nos. 5,639,606; 5,643,765; and 5,876,978 and in Willey, J. C., Crawford, E. L., Jackson, C. M., Weaver, D. A., Hoban, J. C., Khuder, S. A., DeMuth, J. P., Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates, Am J Respir Cell Mol Biol (1998), 19, 6-17. Willey, J. C., Coy, E., Brolly, C., Utell, M. J., Frampton, M. W., Hammersley, J., Thilly, W. G., Olson, D., Cairns, K., Xenobiotic metabolism enzyme gene expression in human bronchial epithelial and alveolar macrophage cells, Am J Respir Cell Biol (1996), 14, 262-271. Apostolakos, M. J., Schuermann, W. H., Frampton, M. W., Utell, M. J., Willey, J. C., Measurement of gene expression by multiplex competitive polymerase chain reaction, Anal. Biochem. (1993), 213, 277-284. Willey, J. C., Coy, E. L., Frampton, M. W., Torres, A., Apostolakos, M. J., Hoehn, G., Schuermann, W. H., Thilly, W. G., Olson, D. E., Hammersley, J. R., Crespi, C. L., Utell, M. J., Quantitative RT-PCR measurement of cytochromes P450 1A1, 1B1, and 2B7, microsomal epoxide hydrolase, and NADPH oxidoreductase expression in lung cells of smokers and non-smokers, Am. J. Respir. Cell Mol. Biol. (1997), 17, 114-124. Briefly, a master mixture containing buffer, MgCl2, dNTPs, sample cDNA, Taq polymerase and competitive template (CT) mixture was prepared, and 9 μl aliquots were dispensed into 0.6 ml microfuge tubes containing 1 μl of gene-specific primers. The CT mixture comprises gene-specific internal standard competitive templates (CTs) at defined concentrations relative to one another and also contains a CT for a housekeeping gene, ACTB, to allow for the normalization of all specific gene data. All primers used for PCR, and those used in the construction of the CTs, are listed in Table 1 (FIG. 1). 
PCR reactions were subjected to 35 cycles of PCR with 5 seconds of denaturation at 94° C., 10 seconds of annealing at 58° C. and 15 seconds of elongation at 72° C. in a Rapidcycler (Idaho Technology, Inc.). PCR products were electrophoretically separated and quantified in an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.) with the DNA 7500 Assay Kit.

Chemoresistance of NSCLC Cell Lines

Chemoresistance IC50 (μM) values of the NSCLC cell lines used for several chemotherapeutic agents were previously determined, as described in Tsai, C. M., Chang, K. T., Wu, L. H., Chen, J. Y., Gazdar, A. F., Mitsudomi, T., Chen, M. H., Perng, R. P., Correlations between intrinsic chemoresistance and HER-2/neu gene expression, p53 mutations, and cell proliferation characteristics in non-small cell lung cancer cell lines, Cancer Res (1996), 56, 206-209, and are summarized for cisplatin in Table 2 (FIG. 2).

Statistical Analyses

Ratios of one gene to another, from each of the initial eight NSCLC cell lines, were subjected to multiple regression analysis with the SAS (version 6, 4th edition, volume 2) statistical package (SAS Institute Inc., Cary, N.C.) to determine the combination of genes that best predicts cisplatin resistance. Each ratio was compared separately to chemoresistance, and ratios with significant correlation to resistance (R² ≥ 0.88, p<0.001) then were examined hierarchically to achieve two variable models based on the highest R² values. Following assessment of an additional 6 cell lines, results for all 14 NSCLC cell lines were combined and subjected to analysis as described.

Results: Reproducibility

Among the gene expression measurements for which three or more replicate values were obtained, the mean coefficient of variation was 38.5% (raw data available at website). This is similar to the reproducibility observed in other gene expression studies using the StaRT-PCR method. Willey, J. C., Crawford, E. L., Jackson, C. M., Weaver, D. A., Hoban, J. C., Khuder, S. A., DeMuth, J. P., Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates, Am. J. Respir. Cell Mol. Biol. (1998), 19, 6-17. Crawford, E. L., Khuder, S. A., Durham, S. J., Frampton, M., Utell, M., Thilly, W. G., Weaver, D. A., Ferencak, W. J., Jennings, C. A., Hammersley, J. R., Olson, D. A., Willey, J. C., Normal bronchial epithelial cell expression of glutathione transferase P1, glutathione transferase M3, and glutathione peroxidase is low in subjects with bronchogenic carcinoma, Cancer Res. (2000), 60, 1609-1618.
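The reproducibility figure above is a mean of per-measurement coefficients of variation. As a minimal sketch of that calculation, assuming the sample standard deviation is used and using hypothetical gene names and replicate values (none of these numbers come from the study):

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    if len(replicates) < 3:
        # Only measurements with three or more replicates are included.
        raise ValueError("at least three replicates are required")
    return 100.0 * stdev(replicates) / mean(replicates)


# Hypothetical replicate values in molecules/10^6 ACTB molecules.
replicates_by_gene = {
    "GENE_A": [100.0, 150.0, 200.0],
    "GENE_B": [10.0, 10.0, 10.0],
}

cv_by_gene = {
    gene: coefficient_of_variation(values)
    for gene, values in replicates_by_gene.items()
}
mean_cv = mean(cv_by_gene.values())
```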

Individual Gene Expression Measurements and Chemoresistance

The results of the direct comparison of individual gene expression mean values versus cisplatin chemoresistance for the first set of eight cell lines (Group 1) are presented in Table 3 (FIG. 3). All StaRT-PCR data values were in the form of molecules/10⁶ ACTB molecules. For 8/12 genes assessed, there was significant (p<0.05) correlation.

Establishment of Inter-Active Gene Expression Ratios

IGEI were established comprising every possible combination of the expression value of one gene divided by the expression value of another gene for data obtained from each of the initial eight NSCLC cell lines (Group 1). Each expression value was calculated as molecules/10⁶ ACTB molecules. Thus, in these IGEI the effect of the reference gene, ACTB, is cancelled. For example:

(ERCC2 molecules/10⁶ ACTB molecules) ÷ (XPC molecules/10⁶ ACTB molecules) = ERCC2 molecules/XPC molecules.
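The cancellation of the reference gene in a two-gene ratio can be checked numerically. The sketch below is illustrative only: the function names and the molecule counts are hypothetical, and exact fractions are used so that the equality is not blurred by floating-point rounding.

```python
from fractions import Fraction


def normalized(target_molecules, actb_molecules):
    """Express a target gene as molecules per 10^6 ACTB molecules."""
    return Fraction(target_molecules) * 10**6 / Fraction(actb_molecules)


def two_gene_ratio(numerator_value, denominator_value):
    """Divide one normalized expression value by another."""
    return numerator_value / denominator_value


# Hypothetical molecule counts for one sample.
ercc2_molecules, xpc_molecules = 4000, 1000

# The same sample measured against two different ACTB levels.
ratio_low_actb = two_gene_ratio(
    normalized(ercc2_molecules, 2_000_000),
    normalized(xpc_molecules, 2_000_000),
)
ratio_high_actb = two_gene_ratio(
    normalized(ercc2_molecules, 8_000_000),
    normalized(xpc_molecules, 8_000_000),
)
# Both ratios equal ERCC2 molecules / XPC molecules, whatever the ACTB level.
```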

Bivariate analysis of each two-gene ratio versus corresponding cisplatin IC50 chemoresistance values was conducted among the eight cell lines (FIG. 4-Table 4). There were 12 genes assessed and 11 sets of ratios for each gene, resulting in 132 ratios. The sets of 11 ratios for each gene then were organized in descending order such that the ratio set listed first was that for which the average correlation with chemoresistance was highest, and the ratio set listed last was that for which the average correlation with chemoresistance was lowest. Thus the ratio set with ERCC2 in the numerator is listed first because the average of the r values for the ratios between ERCC2 and each of the other eleven genes was the most positive among the twelve genes evaluated. In contrast, the ratio set with XPC in the numerator is listed last because the ratios between XPC and each of the other 11 genes had the most negative correlation with chemoresistance.
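The ordering described above can be sketched as follows, assuming Pearson correlation for the r values. The three gene labels, the expression values and the IC50 values are hypothetical stand-ins; with twelve genes the same loop would produce 12 × 11 = 132 ordered ratios.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_numerators(expression, ic50):
    """Correlate every ordered two-gene ratio with IC50, then order the
    genes by the average r of the ratios having that gene as numerator."""
    ratio_r = {}
    for num, den in permutations(expression, 2):
        ratios = [a / b for a, b in zip(expression[num], expression[den])]
        ratio_r[(num, den)] = pearson_r(ratios, ic50)
    average_r = {}
    for gene in expression:
        rs = [r for (num, _), r in ratio_r.items() if num == gene]
        average_r[gene] = sum(rs) / len(rs)
    ranking = sorted(average_r, key=average_r.get, reverse=True)
    return ratio_r, ranking


# Hypothetical data: four cell lines, three genes (not the study's values).
ic50 = [1.0, 2.0, 3.0, 4.0]
expression = {
    "UP": [10.0, 20.0, 30.0, 40.0],    # rises with resistance
    "FLAT": [10.0, 10.0, 10.0, 10.0],  # unrelated to resistance
    "DOWN": [40.0, 30.0, 20.0, 10.0],  # falls with resistance
}
ratio_r, ranking = rank_numerators(expression, ic50)
```

In this toy data set the gene that rises with resistance is ranked first and the gene that falls with resistance is ranked last, mirroring the placement of ERCC2 and XPC described above.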

Modeling of Gene Expression with Chemoresistance

The ratios ERCC2/XPC, ABCC5/GTF2H2, ERCC2/XRCC1, ERCC2/GTF2H2, XPA/XPC, XRCC1/XPC, and ABCC5/XPC were the best single variable models (i.e., those with R²>0.87) identified in the initial eight NSCLC cell lines by simple linear regression (FIG. 5-Table 5). The effect of adding a second variable into the model was then assessed. The best two variable model was (ABCC5/GTF2H2, ERCC2/GTF2H2) with an R² value of 0.96.
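A single variable model of this kind is an ordinary least-squares line relating one ratio to IC50. The following sketch computes the slope, intercept and R² for one hypothetical ratio; the data values are invented for illustration, and only the R²>0.87 cut-off is taken from the text above.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical two-gene ratio and cisplatin IC50 values for four cell lines.
ratio_values = [1.0, 2.0, 3.0, 4.0]
ic50_values = [2.1, 3.9, 6.2, 7.8]

slope, intercept, r_squared = fit_line(ratio_values, ic50_values)
keep_model = r_squared > 0.87  # screening threshold quoted in the text
```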

Validation of Models

These single and two variable models were tested in an additional six NSCLC cell lines. From the statistical analysis of the combined data for all 14 NSCLC cell lines, the p value improved or stayed the same for three of the single variable models (ERCC2/XPC, ABCC5/GTF2H2, XRCC1/XPC), as well as the two variable model. The decline in p value for ERCC2/GTF2H2 and XPA/XPC was small and not significant. In contrast, ERCC2/XRCC1 was no longer significantly associated with chemoresistance, and the p value declined substantially for ABCC5/XPC.

Analysis of Results

The results obtained by measuring gene expression with StaRT-PCR, incorporating values for individual genes into IGEI, and correlating IGEI with chemoresistance provide several models useful as predictors of cisplatin chemoresistance in cultured NSCLC cells. These models comprise genes associated with cisplatin chemoresistance, including ABCC5, ERCC2, XPA, and XRCC1. Increased expression of ABCC5, also known as MRP5, is associated with exposure to platinum drugs in lung cancer in vivo and/or the chronic stress response to xenobiotics. Thus, increased resistance to platinum drugs with increased ABCC5 levels may be due to glutathione S-platinum complex efflux.

The remaining genes directly associated with chemoresistance, XPA and ERCC2, are components of the nucleotide excision repair (NER) mechanism, which generally is recognized as the major repair response to DNA damage induced by chemotherapeutic agents such as cisplatin. In NER, XPA is the main DNA lesion recognition protein (Asahina, H., Kuraoka, I., Shirakawa, M., Morita, E. H., Miura, N., Miyamoto, I., Ohtsuka, E., Okada, Y., Tanaka, K., The XPA protein is a zinc metalloprotein with an ability to recognize various kinds of DNA damage, Mutat. Res. DNA Repair (1994), 315, 229-237) and is the key element in assembly of the NER complex by recruiting several other proteins to the lesion site. Li, L., Peterson, C. A., Lu, X., Legerski, R. J., Mutations in XPA that prevent association with ERCC1 are defective in nucleotide excision repair, Mol Cell Biol (1995), 15, 1993-1998. Enhanced NER gene expression has been shown to be a major cause of resistance to cisplatin and other DNA-damaging chemotherapeutic agents (Zamble, D. B., Lippard, S. J., Cisplatin and DNA repair in cancer chemotherapy, Trends Biochem. Sci. (1995), 20, 435-439; Reed, E., Anticancer drugs: platinum analogs. In: Cancer: Principles and Practice of Oncology, (1993), 390-399. Editors V. T. Devita, Jr., S. Hellman and S. A. Rosenberg, Lippincott, Philadelphia), and overexpression of the XPA gene component of NER has been associated with resistance to cisplatin in human ovarian cancer. Dabholkar, M., Vionnet, J., Bostick-Bruton, F., Yu, J. J., Reed, E., Messenger RNA levels of XPAC and ERCC1 in ovarian cancer tissue correlate with response to platinum-based chemotherapy, J. Clin. Invest. (1994), 94, 703-708. ERCC2 specifically is a component of the transcription factor IIH (TFIIH), which consists of seven polypeptides (Mu, D., Park, C. H., Matsunaga, T., Hsu, D. S., Reardon, J. T., Sancar, A., Reconstitution of human DNA repair excision nuclease in a highly defined system, J. Biol. Chem. (1995), 270, 2415-2418; Mu, D., Hsu, D. S., Sancar, A., Reaction mechanism of human DNA repair excision nuclease, J. Biol. Chem. (1996), 271, 8285-8294) and in its entirety is a repair factor. Schaeffer, L., Moncollin, V., Roy, R., Staub, A., Mezzina, M., Sarasin, A., Weeda, G., Hoeijmakers, J. H., Egly, J. M., The ERCC2/DNA repair protein is associated with the class II BTF2/TFIIH transcription factor, EMBO J. (1994), 13, 2388-2392; Drapkin, R., Reardon, J. T., Ansari, A., Huang, J. C., Zawel, L., Ahn, K., Sancar, A., Reinberg, D., Dual role of TFIIH in DNA excision repair and in transcription by RNA polymerase II, Nature (1994), 368, 769-772; Wang, Z., Svejstrup, J. Q., Feaver, W. J., Wu, X., Kornberg, R. D., Friedberg, E. C., Transcription factor b (TFIIH) is required during nucleotide-excision repair in yeast, Nature (1994), 368, 74-76. In NER, ERCC2 (or XPD) is essential for TFIIH helicase activity (Prakash, S., Sung, P., Prakash, L., DNA repair genes and proteins of Saccharomyces cerevisiae, Annu. Rev. Genet. (1993), 27, 33-70), and it has been demonstrated more recently that ERCC2 interacts specifically with GTF2H2 (or p44) and that this interaction results in the stimulation of the 5′ to 3′ helicase activity. Coin, F., Marinoni, J. C., Rodolfo, C., Fribourg, S., Pedrini, A. M., Egly, J. M., Mutations in the XPD helicase gene result in XP and TTD phenotypes, preventing interaction between XPD and the p44 subunit of TFIIH, Nature Genet, 20, 184-188.

With microarray analysis, because thousands of genes are assessed simultaneously, an index of all genes measured provides a stable reference for the amount of sample loaded from one microarray to another. In quantitative RT-PCR studies, typically, a single non-regulated gene is used as a loading reference, such as ACTB, GAPDH, cyclophilin or ribosomal RNA. However, all of these genes have been reported to vary among multiple samples. One way to assess inter-sample variation in reference gene expression among multiple samples is to compare variation between two reference genes. β-actin and GAPDH vary 50-fold relative to each other among bronchial epithelial cells (BEC) and even more between BEC and other cell types. Willey, J. C., Crawford, E. L., Jackson, C. M., Weaver, D. A., Hoban, J. C., Khuder, S. A., DeMuth, J. P., Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates, Am. J. Respir. Cell Mol. Biol. (1998), 19, 6-17. Rots, J. G., Willey, J. C., Jansen, G., Van Zantwijk, C. H., Noordhuis, P., DeMuth, J. P., Kuiper, E., Veerman, A. J., Pieters, R., Peters, G. J., mRNA expression levels of methotrexate resistance-related proteins in childhood leukemia as determined by a standardized competitive template-based RT-PCR method, Leukemia (2000), 14, 2166-2175. In situations where limited numbers of genes are measured (<200), an index of all genes for the normalization of data is not sufficiently stable. In order to eliminate the effect of unknown variation in the reference gene expression among samples, balanced ratios of one gene expression value obtained by StaRT-PCR to another were analyzed. These balanced ratios did not represent actual cellular concentration changes of the individual genes comprising the ratio, but related the expression of one gene to another and are used for comparison with phenotypic determinants such as chemoresistance. In this study, IGEI analysis (FIG. 5-Table 5) confirmed most of the results obtained by analysis of individual gene expression values relative to chemoresistance (FIG. 3-Table 3). Specifically, XPC was the most stable of the twelve genes assessed relative to chemoresistance, and the same eight genes were correlated with chemoresistance using XPC as the denominator (FIG. 4-Table 4) as was the case using β-actin as the denominator (FIG. 3-Table 3). Thus, variation in β-actin among this group of cDNA samples was not significant. In certain embodiments, it is useful to use IGEI to remove doubt regarding the potential effect of variation in reference gene expression whenever possible.

As is presented in Table 4 (FIG. 4), by evaluating an empirically derived set of balanced ratios (IGEI) derived from expression values for all of the genes measured, it is possible to establish a hierarchy regarding the strength of association between a set of genes and a phenotype. Further, bivariate correlation of each gene relative to each of the others markedly increases the power of the analysis and helps to identify potential outliers that require further validation. In the example herein, the most obvious outlier is the high correlation between ERCC2/XRCC1 and chemoresistance. This is an outlier because (a) the sets of ratios with ERCC2 or XRCC1 in the numerator had the highest and fourth highest range r values, respectively (FIG. 4-Table 4), yet (b) all of the other ratios with ERCC2 in the numerator that had high r values had genes from the bottom of Table 4 (FIG. 4) in the denominator (i.e., XPC, GTF2H2, ABCC10, ERCC3, and LIG1 all were among the lowest in the table). Consistent with the evidence that ERCC2/XRCC1 is an outlier, when the Group 2 cell lines were evaluated, ERCC2/XRCC1 was no longer significantly associated with chemoresistance (FIG. 5-Table 5). These findings provide further evidence for the value of measuring gene expression in standard, numerical format.

Thus, the association of ERCC2, ABCC5, XPA, and XRCC1 with chemoresistance is established through a sequential process involving (a) a first round of screening genes representing many different functional classes, (b) evaluating an expanded group of genes represented by those that are positively associated in the first round, (c) combining the positively connected data into interactive gene expression indices (IGEI), (d) using IGEI analysis to identify outliers, (e) building a model and (f) validating the data.

The method of the present invention highlights the necessity to evaluate the interaction of more than one gene involved in cisplatin chemoresistance and the interaction of multiple pathways that may give rise to chemoresistance.

Example II

The identification of many genes and their association with specific phenotypes will most likely lead to molecular cancer classification (Venter, J. C., The sequence of the human genome, Science, 291:1304-1351 (2001); Lander, E. S., Initial sequencing and analysis of the human genome, Nature, 409:860-921 (2001)). This novel classification system has important clinical implications and may greatly improve patient care. Specifically, recognition of certain genotypes with associated phenotypes may reveal individual prognostic markers and chemosensitivity traits, and may predict patient outcome. Molecular classification of lung cancer may greatly enhance cytologic diagnosis. Lung cancer is still primarily diagnosed using histopathological criteria. The heterogeneity of lung tumors often leads to inconsistent diagnosis (Sorenson, J. B., Hirsch, F. R., Gazdar, A., and Olsen, J. E., Cancer, 71:2971-2976, 1993), including difficulty distinguishing malignant from normal and metastatic lung tumors from primary tumors (Shirakusa, T., Tsutsui, M., Motomaga, R., Ando, K. and Kusano, T., A. Surg., 54:655-658, 1966; Fling, A. and Lloyd, R. V., Arch. Pathol. Lab. Med. 166: 39-42, 1992).

Gene expression patterns have clarified clinical outcomes in lung and breast cancer patients. Garber et al., Proc. Natl. Acad. Sci. 98: 13784-13789 (2001) reported gene expression profiles of lung tumors correlated with transitional morphological classification. In addition, based on gene expression patterns, adenocarcinomas were further divided into 3 subtypes that differed significantly in patient survival. Bhattacharjee et al., Proc. Natl. Acad. Sci., 98: 13790-13795 (2001) reported similar results. Lung adenocarcinomas were grouped into 4 subclasses based on gene expression patterns, and patients had statistically significant differences in survival. They also identified three metastatic lung tumors based on gene expression profile that were morphologically identified as primary lung tumors. Molecular classification of breast cancer tumors based on gene expression profiles, and correlation to patient outcome and cell proliferation rates, have also been reported in cases of hereditary breast cancer, sporadic breast cancer and human mammary epithelial cells (Hedenfalk et al., New Eng. J. of Med. 344: 539-548, 2001; Sorlie et al., Proc. Natl. Acad. Sci. 98:10869-10874, 2001; and Perou et al., Distinctive gene expression patterns in human mammary epithelial cells and breast cancers, Proc. Natl. Acad. Sci. USA vol. 96, no. 16: 9212-9217, 1999).

Most lung cancers are diagnosed primarily by fine-needle aspirate (FNA) biopsy tissues, pleural fluid samples and brushings of bronchial epithelial cells. These small, non-renewable tissue samples are challenging to use in gene expression studies. Microarray methods are appropriate for screening thousands of genes potentially involved in numerous cancer phenotypes; however, they are unsuitable for FNA gene expression analysis because of the large initial RNA amounts required, lack of internal standards, cost and time (Tyagi, S., and Kramer, F. R., Nature Biotech. 14: 303-308, 1996; DeRisi, J. L., Science, 278: 6860-6866, 1997). After target gene identification, gene expression analysis should be further evaluated with a quantitative, standardized gene expression method.

StaRT-PCR (Standardized Reverse Transcriptase-Polymerase Chain Reaction) is an ideal gene expression method to use in small clinical samples. It is useful to measure hundreds of genes simultaneously, requires small amounts of RNA, uses inexpensive equipment, and is sensitive, standardized and highly reproducible (Willey et al., Am. J. Respir. Cell Mol. Biol. 19: 6-17, 1998; Crawford, E. L., Godfridus, J. Peters, Noordhuis, P., Rots, M. G., Vondracek, M., Grafstrom, R. C., Lieuallen, K., Lennon, G., Zahorchak, R. J., Georgeson, M. J., Wali, A., Lechner, J. F., Fan, P-S., Kahaleh, B., Khuder, S. A., Warner, K. A., Weaver, D. A., and Willey, J. C. (2001), Reproducible gene expression measurement among multiple laboratories obtained in a blinded study using standardized RT (StaRT)-PCR, Molecular Diagnosis 6: 217-225, 2001). It is likely that malignant, chemoresistant and metastatic phenotypes result from the interactive effects of many genes. Because the data are numerical in StaRT-PCR studies, phenotypes can be represented by interactive gene expression indices (IGEI). DeMuth et al., Am. J. Respir. Cell Mol. Biol. 19: 18-24, 1998, reported that the gene expression index c-myc×E2F-1/p21 predicted malignancy in human bronchial epithelial cells better than any individual gene measured. In a similar study, the gene expression index of mGST×GSTM3×GSHPx×GSHPxA×GSTP1 was 90% sensitive and 76% specific for detecting normal bronchogenic epithelial cells from subjects with bronchogenic carcinoma (Crawford et al., Cancer Research, 60: 1609-1618, 2000). Specifically, this interactive gene expression index identified individuals at risk for developing bronchogenic carcinoma better than any single gene.

The inclusion of standardized, competitive templates in every StaRT-PCR reaction allows direct intra-laboratory and inter-laboratory data comparison (Willey et al., 1998). Crawford et al. (2001) reported high inter-laboratory reproducibility using StaRT-PCR (Crawford, E. L., Godfridus, J. Peters, Noordhuis, P., Rots, M. G., Vondracek, M., Grafstrom, R. C., Lieuallen, K., Lennon, G., Zahorchak, R. J., Georgeson, M. J., Wali, A., Lechner, J. F., Fan, P-S., Kahaleh, B., Khuder, S. A., Warner, K. A., Weaver, D. A., and Willey, J. C. (2001), Reproducible gene expression measurement among multiple laboratories obtained in a blinded study using standardized RT (StaRT)-PCR, Molecular Diagnosis 6:217-225, 2001). The generation of standardized, numerical data is needed for establishing a common, multi-institutional database. A recent modification of StaRT-PCR, termed multiplex standardized RT-PCR, allows further reduction in the amount of starting material needed for gene expression studies (Crawford, E. L., Warner, K. A., Khuder, S. A., Zahorchak, R. J., and Willey, J. C., Multiplex standardized RT-PCR for expression analysis of many genes in small clinical samples, Biochemical and Biophysical Research Communications, 293: 509-516, 2002). Using multiplex StaRT-PCR, at least 96 genes may be simultaneously evaluated using the same amount of cDNA that is normally used for measurement of one gene (Crawford et al., 2002, supra). This method was used to simultaneously measure 18 genes putatively associated with chemoresistance in a bronchogenic carcinoma sample obtained by FNA.

This example determines whether a high c-myc×E2F-1/p21 gene expression index could augment cytopathological diagnosis of bronchogenic carcinoma. Standardized gene expression values for c-myc, E2F-1 and p21 and the interactive gene malignancy index were determined for eight primary lung FNA samples.
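The malignancy index named above is the product of the c-myc and E2F-1 expression values divided by the p21 expression value. A minimal sketch of that arithmetic follows; the function name and the input values are hypothetical, and no diagnostic cut-off is implied because none is stated here.

```python
def malignancy_index(c_myc, e2f1, p21):
    """Return c-myc x E2F-1 / p21 for one sample.

    Inputs are expression values in molecules/10^6 beta-actin molecules.
    Because two genes sit in the numerator and one in the denominator,
    one factor of the reference gene remains in the result; it does not
    cancel completely as it does in a one-gene-over-one-gene ratio.
    """
    if p21 <= 0:
        raise ValueError("p21 expression must be positive")
    return c_myc * e2f1 / p21


# Hypothetical expression values for a single FNA sample.
index_value = malignancy_index(c_myc=2000.0, e2f1=500.0, p21=100.0)
```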

Materials and Methods

Cell Culture

The H1155 human NSCLC cell line was purchased from ATCC (Manassas, Va.),and cultured (37° C., 5.0% CO₂) in RPMI supplemented with gentamicin(0.1%) (Biofluids, Rockville, Md.) and 10% fetal bovine serum (FBS)(Sigma, St. Louis, Mo.).

Evaluation of RNA Preservation, Extraction and Reverse Transcription

H1155 cells (1.0×10⁶) were placed in Preservcyt (CYTYC/Boxborough, Mass.), RNA-Later (Ambion/Austin, Tex.) or Tri-Reagent (Molecular Research Center, Cincinnati, Ohio) prior to RNA extraction. Time points and temperatures evaluated for RNA quality were 1, 3, 10 and 30 days and room temperature, 4° C. and −20° C. RNA was extracted from cells using Tri Reagent according to the manufacturer's protocol. After extraction, RNA quality was evaluated on an Agilent 2100 Bioanalyzer for detection of 18S and 28S ribosomal peaks. mRNA samples were reverse transcribed using M-MLV reverse transcriptase (Gibco BRL, Gaithersburg, Md.) and oligo (dT) primer (Promega, Madison, Wis.) as previously described. (DeMuth, J. P., Jackson, C. M., Weaver, D. A., Crawford, E. L., Durzinsky, D. S., Durham, S. J., Zaher, A., Phillips, E. R., Khuder, S. A. and Willey, J. C. (1998), The gene expression index of c-myc×E2F-1/p21 is highly predictive of malignant phenotype in human bronchial epithelial cells, Am. J. Respir. Cell Mol. Biol., 19, 18-24. Crawford, E. L., Khuder, S. A., Durham, S. J., Frampton, M., Utell, M., Thilly, W. G., Weaver, D. A., Ferencak, W. J., Jennings, C. A., Hammersley, J. R., Olson, D. A., and Willey, J. C. (2000), Normal bronchial epithelial cell expression of the glutathione transferase P1, Glutathione transferase M3, and Glutathione peroxidase is low in subjects with bronchogenic carcinoma, Cancer Research, 60, 1609-1618.)

Uniplex-StaRT-PCR

StaRT-PCR was performed using previously published protocols (Willey, J. C., Crawford, E. L., Jackson, C. M., Weaver, D. A., Hoban, J. C., Khuder, S. A., DeMuth, J. P. (1998), Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates, Am. J. Respir. Cell Mol. Biol., 19, 6-17. DeMuth, J. P., Jackson, C. M., Weaver, D. A., Crawford, E. L., Durzinsky, D. S., Durham, S. J., Zaher, A., Phillips, E. R., Khuder, S. A., Willey, J. C., (1998), The gene expression index of c-myc×E2F-1/p21 is highly predictive of malignant phenotype in human bronchial epithelial cells, Am. J. Respir. Cell Mol. Biol., 19, 18-24. Crawford, E. L., Khuder, S. A., Durham, S. J., Frampton, M., Utell, M., Thilly, W. G., Weaver, D. A., Ferencak, W. J., Jennings, C. A., Hammersley, J. R., Olson, D. A., Willey, J. C., (2000), Normal bronchial epithelial cell expression of the glutathione transferase P1, Glutathione transferase M3, and Glutathione peroxidase is low in subjects with bronchogenic carcinoma, Cancer Research, 60, 1609-1618. Crawford, E. L., Godfridus, J. P., Noordhuis, P., Rots, M. G., Vondracek, M., Grafstrom, R. C., Lieuallen, K., Lennon, G., Zahorchak, R. J., Georgeson, M. J., Wali, A., Lechner, J. F., Fan, P-S., Kahaleh, B., Khuder, S. A., Warner, K. A., Weaver, D. A., Willey, J. C., (2001), Reproducible gene expression measurement among multiple laboratories obtained in a blinded study using standardized RT (StaRT)-PCR, submitted. Gene Express System 1 Instruction Manual, Gene Express National Enterprises, Inc. (2000), www.genexnat.com.) with the G.E.N.E. system I expression kit (Gene Express National Enterprises, Inc.).

There were six CT mixtures, A-F, and appropriate primers included in the System 1 kit. The concentration of “target gene” CTs varies in each mix compared to the concentration of the “reference gene” actin. The master mix contained RNase-free water, MgCl2 buffer, dNTPs, cDNA, CT mixture from the G.E.N.E. system I kit and Taq polymerase. The master mix was placed into tubes containing individual gene primers, and cycled in a Rapidcycler (Idaho Technology, Inc., Idaho Falls, Id.). The denaturing temperature was 94° C., the annealing temperature was 58° C. and the elongation temperature was 72° C. for each cycle. After amplification, each PCR product was analyzed by capillary electrophoresis on an Agilent 2100 Bioanalyzer machine. The area under the curve of each native template (NT) was compared to that of its respective competitive template (CT) to determine gene expression values. The unit for each expression value was molecules per 10⁶ β-actin molecules.
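The NT-to-CT comparison described above can be illustrated with a simplified calculation. The sketch below assumes that the peak-area ratio equals the molar ratio of native to competitive template (it ignores any correction for the differing fragment lengths) and that the number of CT molecules added to each reaction is known; the function names and all numbers are hypothetical.

```python
def native_molecules(nt_area, ct_area, ct_molecules):
    """Estimate native template molecules from the NT/CT peak-area ratio.

    Simplifying assumption: area ratio == molar ratio, with no
    size correction for the shorter or longer competitive template.
    """
    return nt_area / ct_area * ct_molecules


def per_million_reference(target_molecules, reference_molecules):
    """Express a target as molecules per 10^6 reference-gene molecules."""
    return target_molecules * 1_000_000 / reference_molecules


# Hypothetical peak areas and CT inputs for one target gene and beta-actin.
target = native_molecules(nt_area=50.0, ct_area=100.0, ct_molecules=600.0)
actb = native_molecules(nt_area=200.0, ct_area=100.0, ct_molecules=600_000.0)

expression_value = per_million_reference(target, actb)
```

With these invented inputs, the target is estimated at 300 molecules and β-actin at 1,200,000 molecules, so the expression value is reported relative to β-actin rather than as an absolute count.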

Acquisition of Bronchogenic Carcinoma Samples

Fine needle aspirates (FNA) of primary lung cancers were obtained from patients at the Medical College of Ohio. An informed, signed consent was obtained from patients according to NIH and institutional guidelines prior to each procedure. Most cells were placed directly on slides for diagnostic purposes. Cells not needed for diagnostic purposes were collected in Preservcyt® Solution (CYTYC/Boxborough, Mass.). After final cytopathologic diagnosis, remaining cells in Preservcyt were pelleted in our laboratory and RNA was extracted. Cell number and viability were evaluated by analysis of cells on glass slides.

Results

In an effort to determine optimal collection and preservation of RNA in FNA specimens, H1155 cells (NSCLC) were placed in 3 storage reagents: RNA Later, Preservcyt and Tri Reagent. To determine effects of time and temperature on RNA, H1155 cells were kept at 4° C. or −20° C. for 1, 3, 10 and 30 days.

High quality RNA, indicated as ++ (exhibiting the presence of 18S and 28S ribosomal bands), was detected in H1155 NSCLC cells stored in each reagent for up to 10 days (FIG. 6-Table 6). RNA was preserved equally well in Preservcyt and TRI reagent after 30 days of storage. RNA was partially degraded after 30 days of storage in RNA Later (+−) at 4° C. and was not preserved at −20° C.

To determine whether the RNA was suitable for StaRT-PCR, it was reverse transcribed and β-actin expression was evaluated. β-actin was detected in all samples exhibiting high quality or partially degraded RNA (FIG. 6, Table 6). As expected, β-actin was not detected in cells stored in RNA Later for 30 days at −20° C. RNA quality correlated highly with the ability to be PCR amplified. The optimal storage reagents for short term storage (1-10 days) are Preservcyt and RNA Later; for long term storage (greater than 10 days), Preservcyt is recommended. Preservcyt is also advantageous at institutions that utilize the Thin Prep System for cytological analysis.

After determination of optimal collection and storage conditions, lung FNA specimens were placed in Preservcyt and stored at 4° C. As with the H1155 cells, RNA quality was evaluated, in 9 of 10 FNA specimens (FIG. 7, Table 7). Five samples had high quality (++) or partially degraded (+−) RNA. As expected, all five samples were PCR amplifiable and β-actin was detected. One sample was not evaluated (NE) prior to reverse transcription and 4 samples exhibited poor quality RNA (−). β-actin was detected in the NE specimen and, unexpectedly, was detected in 2 of the RNA (−) samples. When high quality RNA is present, it is highly suitable for PCR experiments. When poor quality RNA is present, it is less likely to be PCR amplifiable but may still be useful.

In an attempt to determine why 4 samples had poor quality RNA, the cytological characteristics of each specimen were determined independently by a pathologist. Cellularity, viability and percent tumor/normal cells were determined for each sample (FIG. 7, Table 7). Seven of 10 samples had low cellularity (L) and low viability (L). Three of these samples had a (++) or (+−) RNA status and all were PCR amplifiable (+β-actin). Of the remaining four samples with low cellularity and low viability, two were PCR amplifiable and two were not. Three of 10 samples had intermediate (I) or high (H) cellularity and viability. All three had good quality RNA and were PCR amplifiable. It is likely that cellularity is related to the amount of RNA extracted and that viability is related to the quality of the RNA obtained from these cells. Specimens with intermediate to high cellularity are optimal for gene expression studies, but specimens with low cellularity and low viability are still suitable, since 5 of 7 were PCR amplifiable.

In 7 of 10 samples, the percentage of tumor cells varied from 60-90%. Two samples had 20% tumor cells and one had 10% tumor cells. The FNA diagnosis, determined at the time of sample acquisition, was NSCLC in 6 samples and atypical in 4. To confirm the presence of a malignant phenotype, 3 genes associated with malignancy (c-myc, E2F-1 and p21) were evaluated in 8 of 10 FNAs and the malignancy index c-myc×E2F-1/p21 was determined (FIG. 8, Table 8). As expected, 5 of the +NSCLC samples had a very high index value, ranging from 1.0E4 to 3.6E6 (as molecules per 10⁶ β-actin molecules). Three of the four atypical samples also exhibited high malignant gene expression indices, with values ranging from 7.2E3 to 5.0E4. After additional analysis, the three atypical samples from which gene expression data were obtained were later confirmed as small cell lung cancer (SCLC). The percentage of tumor cells in the atypical samples ranged from 20 to 80%, indicating that even a small number of abnormal cells was sufficient to be detected by the gene expression index c-myc×E2F-1/p21.
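The index calculation above can be sketched as follows. This is a minimal illustration, assuming that each of the three expression values is already expressed as molecules per 10⁶ β-actin molecules. The function names and the input values are hypothetical; the only figure taken from this example is 7.2E3, the lowest index reported above for the samples later confirmed as cancer.

```python
# Lowest index value reported above for the confirmed cancer samples.
LOWEST_REPORTED_INDEX = 7.2e3


def malignancy_index(c_myc, e2f1, p21):
    """Return c-myc x E2F-1 / p21.

    Each argument is an expression value in molecules per 10^6
    beta-actin molecules.
    """
    if p21 <= 0:
        raise ValueError("p21 must be detected (> 0) to compute the index")
    return c_myc * e2f1 / p21


def at_or_above_reported_range(index, cutoff=LOWEST_REPORTED_INDEX):
    """True when the index is at or above the chosen cutoff."""
    return index >= cutoff


# Hypothetical expression values, for illustration only.
index = malignancy_index(c_myc=4000.0, e2f1=500.0, p21=100.0)
flagged = at_or_above_reported_range(index)
print(f"index = {index:.1e}, at or above cutoff: {flagged}")
```

Because p21 is in the denominator, the index rises both when c-myc or E2F-1 expression increases and when p21 expression falls, which is why the sketch refuses to compute a value when p21 is not detected.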

While FNA analysis of pulmonary nodules is a common diagnostic method, this is the first example of the use of a standardized, quantitative gene expression method on human lung FNA samples. Gene expression profiling of these small, non-renewable cell populations has diagnostic and prognostic implications and can lead to individualized patient care. Different gene expression patterns are useful to discriminate between SCLC and NSCLC, and earlier identification of a malignant phenotype will optimize clinical treatment. In addition, StaRT-PCR is useful to identify gene expression patterns and associate them with clinically relevant phenotypes, e.g., chemosensitivity and metastatic potential, to improve patient prognosis.

In this example, 5 of the FNA samples initially diagnosed as NSCLC, and later confirmed to be NSCLC, had high index values. The range of expression for these +NSCLC specimens was 1.0E4-3.6E6. In this sample set, 4 had 90.0% tumor cells and sample #172 had only 20% tumor cells, yet had the highest index value, 6.5E5. Three of the FNA samples, cytologically diagnosed as atypical and later confirmed to be SCLC, also had high index values. They ranged from 7.20E3-5.0E4 mRNA molecules per 10⁵ molecules β-actin mRNA. The percentage of tumor cells in these samples ranged from 20-80%.

Other Embodiments

The genes and IGEI marker sets described herein provide valuable information for the identification of new drug targets against NSCLC, and that information may be extended for use in the study of carcinogenesis in other tissues. These sequences may be used in the methods of the invention or may be used to produce the probes and arrays of the invention.

While the invention has been described with reference to various and preferred embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the essential scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof.

Therefore, it is intended that the invention not be limited to the particular embodiment disclosed herein contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the claims.

What is claimed is:
 1. A method for diagnosing bronchogenic carcinoma in an individual, the method comprising the steps of: a) providing a lung cell and/or tissue sample from the individual; b) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a gene expression ratio of c-myc×E2F-1/p21; and d) diagnosing bronchogenic carcinoma in the individual when the calculated gene expression ratio is greater than or equal to 7.20E3 (as molecules per 10⁶ β-actin molecules).
 2. The method of claim 1, wherein the lung cell and/or tissue sample is collected by fine needle biopsy.
 3. The method of claim 1, wherein the levels of mRNA for each of c-myc, E2F-1, and p21 in the lung cell and/or tissue sample are determined by StaRT-PCR.
 4. The method of claim 1, wherein the individual is diagnosed as having bronchogenic carcinoma when the calculated gene expression ratio is greater than or equal to 1.0E4 (as molecules per 10⁶ β-actin molecules).
 5. The method of claim 1, wherein the individual is diagnosed as having bronchogenic carcinoma when the calculated gene expression ratio is from 7.20E3 to 3.6E6 (as molecules per 10⁶ β-actin molecules).
 6. The method of claim 1, wherein the individual is diagnosed as having bronchogenic carcinoma when the calculated gene expression ratio is from 1.0E4 to 3.6E6 (as molecules per 10⁶ β-actin molecules).
 7. A method for diagnosing bronchogenic carcinoma in an individual, the method comprising the steps of: a) providing a lung cell and/or tissue sample from the individual; b) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a gene expression ratio of c-myc×E2F-1/p21; d) comparing the calculated gene expression ratio to a control gene expression ratio; and e) diagnosing bronchogenic carcinoma in the individual when the calculated gene expression ratio is greater than the control gene expression ratio.
 8. The method of claim 7, wherein the control gene expression ratio is determined by: a) providing a lung cell and/or tissue sample from a healthy, cancer-free individual; b) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; and c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a control gene expression ratio of c-myc×E2F-1/p21.
 9. The method of claim 8, wherein the control gene expression ratio is determined by: a) providing a lung cell and/or tissue sample from two or more healthy, cancer-free individuals; b) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample for each cancer-free individual, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a gene expression ratio of c-myc×E2F-1/p21 for each cancer-free individual; and d) determining a control gene expression ratio by calculating a mean for the gene expression ratios of the two or more healthy, cancer-free individuals.
 10. A method of detecting the progression of bronchogenic carcinoma in an individual, comprising the steps of: a) providing a first lung cell and/or tissue sample from the individual, wherein the first lung cell and/or tissue sample is collected at a first point in time; b) determining levels of c-myc, E2F-1, and p21 gene products in the first lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a first gene expression ratio of c-myc×E2F-1/p21; d) providing a second lung cell and/or tissue sample from the individual, wherein the second lung cell and/or tissue sample is collected at a second point in time, later than the first point in time; e) determining levels of c-myc, E2F-1, and p21 gene products in the second lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; and f) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step e), calculating a second gene expression ratio of c-myc×E2F-1/p21, wherein the second gene expression ratio being higher than the first gene expression ratio is indicative of progression of bronchogenic carcinoma in the individual.
 11. The method of claim 10, wherein both the first and second lung cell and/or tissue samples are collected by fine needle biopsy.
 12. The method of claim 10, wherein the levels of mRNA for each of c-myc, E2F-1, and p21 in both the first and second lung cell and/or tissue samples are determined by StaRT-PCR.
 13. A method of diagnosing bronchogenic carcinoma in an individual, comprising detecting and quantifying the level of expression of c-myc, E2F-1 and p21 genes in a lung cell and/or tissue sample from the individual, wherein differential expression of the c-myc, E2F-1 and p21 genes is diagnostic of bronchogenic carcinoma in the individual.
 14. The method of claim 13, wherein the tissue sample is collected by fine needle biopsy.
 15. The method of claim 13, wherein the steps of detecting and quantifying are carried out by microarray analysis, quantitative RT-PCR, or a combination of both.
 16. The method of claim 13, wherein the type of RT-PCR employed is StaRT-PCR.
 17. A method of detecting the progression of bronchogenic carcinoma in an individual, comprising detecting and quantifying the level of expression of c-myc, E2F-1 and p21 genes in a lung cell and/or tissue sample from the individual, wherein differential expression of the c-myc, E2F-1 and p21 genes is indicative of bronchogenic carcinoma progression in the individual.
 18. The method of claim 17, wherein the tissue sample is collected by fine needle biopsy.
 19. The method of claim 17, wherein the steps of detecting and quantifying are carried out by microarray analysis, quantitative RT-PCR, or a combination of both.
 20. The method of claim 17, wherein the type of RT-PCR employed is StaRT-PCR.
 21. A method of monitoring treatment of bronchogenic carcinoma in an individual, comprising the steps of: a) administering one or more pharmaceutical compositions to the individual; b) providing a first lung cell and/or tissue sample from the individual, wherein the first lung cell and/or tissue sample is collected at a first point in time; c) determining levels of c-myc, E2F-1, and p21 gene products in the first lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; d) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step c), calculating a first gene expression ratio of c-myc×E2F-1/p21; e) providing a second lung cell and/or tissue sample from the individual, wherein the second lung cell and/or tissue sample is collected at a second point in time, later than the first point in time; f) determining levels of c-myc, E2F-1, and p21 gene products in the second lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; and g) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step f), calculating a second gene expression ratio of c-myc×E2F-1/p21, wherein the second gene expression ratio being higher than the first gene expression ratio is indicative of progression of bronchogenic carcinoma in the individual, and the second gene expression ratio being lower than the first gene expression ratio is indicative of improvement of bronchogenic carcinoma.
 22. A method of monitoring treatment of bronchogenic carcinoma in an individual, comprising the steps of: a) administering one or more pharmaceutical compositions to the individual; b) providing a lung cell and/or tissue sample from the individual; c) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; d) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step c), calculating a gene expression ratio of c-myc×E2F-1/p21; and e) comparing the calculated gene expression ratio to a control gene expression ratio, wherein the calculated gene expression ratio being higher than the control gene expression ratio is indicative of a need to continue treatment of the individual.
 23. The method of claim 22, wherein the method is repeated one or more times during treatment of the individual.
 24. The method of claim 22, wherein the control gene expression ratio is determined by: a) providing a lung cell and/or tissue sample from a healthy, cancer-free individual; b) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; and c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a control gene expression ratio of c-myc×E2F-1/p21.
 25. The method of claim 22, wherein the control gene expression ratio is determined by: a) providing a lung cell and/or tissue sample from two or more healthy, cancer-free individuals; b) determining levels of c-myc, E2F-1, and p21 gene products in the lung cell and/or tissue sample for each cancer-free individual, wherein the levels are determined by microarray analysis, quantitative RT-PCR, or a combination of both; c) utilizing the levels of c-myc, E2F-1, and p21 gene products determined in step b), calculating a gene expression ratio of c-myc×E2F-1/p21 for each cancer-free individual; and d) determining a control gene expression ratio by calculating a mean for the gene expression ratios of the two or more healthy, cancer-free individuals.